Neurofibromatosis gene

ABSTRACT

The invention relates to the gene involved in the von Recklinghausen neurofibromatosis (NF1) disease process and to the identification, isolation and cloning of a nucleic acid sequence corresponding to the gene. The invention further relates to the NF1 gene product and sequence and antibodies raised thereto. The invention also relates to methods of screening for NF1 and NF1 diagnosis, as well as conventional treatment and gene therapy utilizing recombinant technologies.

RELATED APPLICATIONS

This is a 371 filing of International Application No. PCT/US91/04624, filed Jun. 28, 1991, by Collins et al., which is a continuation-in-part application of U.S. Ser. No. 07/547,090 entitled “Neurofibromatosis Gene,” filed Jun. 29, 1990, by Collins et al., now abandoned.

This invention was made in part with governmental support under National Institutes of Health grants NS23410 and NS23427. The government may have certain rights to this invention.

FIELD OF THE INVENTION

The present invention relates generally to the gene involved in the von Recklinghausen neurofibromatosis (NF1) disease process and, more particularly, to the identification, isolation and cloning of a nucleic acid sequence corresponding to the gene. The present invention further relates to the NF1 gene product and sequence and antibodies raised thereto. The present invention also relates to methods of screening for NF1 and NF1 diagnosis, as well as conventional treatment and gene therapy utilizing recombinant technologies.

BACKGROUND OF THE INVENTION

Von Recklinghausen neurofibromatosis (NF1), often referred to as the “elephant man disease,”¹ is one of the most common autosomal dominant human disorders, affecting about 1 in 3,000 of the general population. The disease primarily involves neural crest-derived tissue and is characterized by café-au-lait spots, neurofibromas increasing in size and number with age, learning disabilities and mental retardation, seizures, and an increased risk of malignancy. The expression of the disease is extremely variable in its symptoms and severity, and the spontaneous mutation rate is remarkably high, with about 30 to 50% of all cases representing new mutations. Clinical diagnosis of NF1 has been relatively difficult early in life, due to the variability of the symptoms and their delayed appearance.

¹ An NIH panel, however, recently concluded that “Elephant Man” J. Merrick did not actually suffer from neurofibromatosis, but from an extremely rare disease known as the Proteus syndrome.

Direct cloning of the NF1 gene has not been possible due to the lack of a consistent abnormality in NF1 tissue which would provide sufficient information about the gene product. The remaining alternative has been positional cloning of the gene, utilizing its chromosomal map position rather than its functional properties. Using this approach, genetic linkage analysis led to the assignment of the NF1 gene to the proximal long arm of chromosome 17. Subsequent collaborative multipoint mapping efforts narrowed its genetic location to about 3 centiMorgans of 17q11.2. A combination of somatic cell hybrid techniques, linking clones and pulsed field gel electrophoresis (PFGE) applied to two unrelated NF1 patients having balanced translocations t(1;17) and t(17;22), with breakpoints approximately 60 kb apart on chromosome 17, further narrowed the location of the gene to a few hundred kilobases of chromosome band 17q11.2. See Collins, F. S. et al., Trends in Genetics 5:217-221 (1989).

The first NF1 candidate gene was identified in mice as a site of retroviral integration in murine leukemia. See Buchberg, A. M. et al., Oncogene Research 2:149 (1988). It has now been found, however, that the human homolog EVI2A, previously named EVI2, which maps between the NF1 breakpoints, is not interrupted by the aforementioned NF1 translocations, and no abnormalities in this gene have been identified as the cause of NF1. Similarly, EVI2B, previously named NF1-c2, a gene newly identified in the course of this invention by chromosomal walking and jumping, mapped between the NF1 breakpoints, was not interrupted by the NF1 translocations and exhibited no abnormalities in NF1 patients. It thus became clear that the NF1 gene had not yet been identified.

Recently, a gene was identified by positional cloning showing mutations in individuals affected with NF1. Cawthon, R. et al., Cell 62:193-201 (1990); Viskochil, D. et al., Cell 62:187-192 (1990); Wallace, M. R. et al., Science 249:181-186 (1990). Further cloning and partial sequence analysis demonstrated that the gene product contains a domain showing approximately 30% similarity to the catalytic domains of yeast IRA1 and IRA2 proteins and the mammalian GTPase activating protein (GAP). Buchberg, A. et al., Nature 347:291-294 (1990); Xu, G. et al., Cell 62:599-605 (1990). GAP is a cytosolic protein that catalyzes the conversion of active GTP-bound ras p21 to the inactive GDP-bound form. Trahey, M. et al., Science 238:542-545 (1987). It was subsequently shown that the GAP related domain of the NF1 gene product can also interact with human and yeast RAS p21 to down-regulate its activity. Ballester, R. et al., Cell 63:851-859 (1990); Martin, G. A. et al., Cell 63:343-349 (1990); Xu, G. et al., Cell 63:835-841 (1990). Our previous reports of cDNA cloning of NF1 contained in parent application U.S. Ser. No. 547,090 were based on partial fragments of the transcript, which is approximately 13 kb by Northern blotting. Wallace, M. R. et al., Science 249:181-186 (1990). The entire coding region of the NF1 gene has now been cloned and sequenced, the gene product identified and antibodies raised thereto, as described and claimed herein.

SUMMARY OF THE INVENTION

The entire coding region of the gene involved in von Recklinghausen neurofibromatosis (NF1 gene) and a ubiquitously expressed large transcript (NF1LT) of approximately 13 kb have been isolated, cDNA cloned and sequenced as set forth in FIGS. 16 and 19 and the Sequence Listing. Analysis of the sequences revealed an open reading frame of 2818 amino acids, although alternatively spliced products may code for different sized protein products. The gene extends for a minimum of 270 kb on chromosome 17, with its promoter in a CpG rich island. The NF1 sequence is highly conserved and shows homology to the GTPase activating proteins family (GAP). The gene is interrupted by both NF1 translocations and altered in a new mutation NF1 patient, and contains previous candidate genes EVI2A (EVI2) and EVI2B (NF1-c2) within it. Antibodies which specifically recognize the NF1 gene product have been generated against both fusion proteins and synthetic peptides. Initial characterization of the NF1 gene product by both immunoprecipitation and Western blotting has revealed a unique protein of approximately 250 kDa. The protein has been found in a variety of human tissues and cell lines and is also present in rat and mouse tissues.

With the identification and sequencing of the gene and its corresponding gene product, nucleic acid probes and antibodies raised to the NF1 gene product can be used within the scope of the invention in a variety of hybridization and immunological assays to screen for the presence of a normal or defective NF1 gene or gene product. Functional assays to measure levels of gene function can also be employed for diagnosis or to monitor treatment. Assay kits for such screening and diagnosis in accordance with the principles of the invention can also be provided.

Patient therapy through supplementation with the normal NF1 protein, whose production can be amplified using genetic and recombinant techniques, or with its functional equivalent, is also possible. In addition, NF1 may be cured or controlled through gene therapy by correcting the gene defect in situ or using recombinant or other vehicles to deliver a DNA sequence capable of expression of the normal gene product to the patient. Treatment of non-NF1 tumors of the nervous system and growth stimulation of nervous tissue are also contemplated.

Other features and advantages of the present invention will become apparent from the following description and appended claims, taken in conjunction with the accompanying drawings.

BRIEF DESCRIPTION OF THE DRAWINGS AND TABLES

FIG. 1 is a schematic of the NF1 region drawn to scale in kilobases.

FIG. 2A is a Southern blot map of cDNA clones B3A and P5.

FIG. 2B is a schematic map of part of the NF1LT cDNA including cDNA clones P5 and B3A.

FIG. 3 is a Southern blot of human, mouse and hybrid DNA using the 5′ end of the P5 probe, illustrating that NF1LT spans the t(17;22) breakpoint.

FIG. 4A is a Northern blot of various human tissues using the P5 probe.

FIG. 4B is a Northern blot of two independent melanoma cell lines using the P5 probe.

FIG. 4C is a PCR analysis of RNA expression in various human tissues and cell lines using primers A and B.

FIG. 4D shows PCR results from RNA from mouse-human hybrid and parental cell lines using primers C and D, which do not amplify mouse NF1LT RNA.

FIGS. 5A-D are Southern blots of the DNA of a new mutation NF1 patient, his parents and normal human DNA, using probe P5, which show a 0.5 kb insertion in the patient.

FIG. 6 is a schematic partial exon map of NF1LT (SEQ ID NO:3 and SEQ ID NO:4).

FIG. 7 is a schematic diagram representing the cDNA walk in the NF1 gene.

FIG. 8 is a PAGE analysis of the primer extension off of frontal lobe human brain total RNA and melanoma cell mRNA.

FIG. 9 shows the extent of the NF1 transcript on the genomic map of chromosome 17 (SEQ ID NO:5 and SEQ ID NO:6).

FIG. 10 is a map of the NF1 gene product with the positions of the fusion proteins and synthetic peptides.

FIG. 11 presents SDS-PAGE results showing the expression of pMAL.c fusion proteins (SEQ ID NO:7 through SEQ ID NO:11).

FIG. 12 presents (A) SDS-PAGE results after immunoprecipitation and (B) a Western blot analysis illustrating that H and D peptide antisera recognize a 250 kDa protein (SEQ ID NO:12).

FIG. 13 is a Western blot illustrating that D peptide antiserum recognizes an independently generated fusion protein.

FIG. 14 presents SDS-PAGE results illustrating that pMAL.B3A fusion protein and H peptide antisera recognize the same proteins precipitated by D peptide antiserum.

FIG. 15 presents SDS-PAGE results illustrating that pMAL.B3A fusion protein antiserum detects the NF1 protein in a variety of adult mouse tissues.

FIG. 16 is a partial nucleotide sequence of NF1LT cDNA with its corresponding amino acid sequence (SEQ ID NO:13).

FIG. 17 is the cDNA sequence of the 5′ portion of the NF1 transcript (SEQ ID NO:5).

FIG. 18 presents alternate 5′ ends found in cDNA clones (SEQ ID NOS:6-8).

FIG. 19 is the complete amino acid sequence of the NF1 gene product (SEQ ID NO:2) as deduced from the open reading frame of sequenced clones (SEQ ID NO:1).

DETAILED DESCRIPTION OF THE INVENTION

The NF1 gene, referred to as NF1LT prior to confirmation in Specific Example 1, has been identified, and its coding region cloned and sequenced. Partial and complete coding sequences of the NF1 gene and their corresponding amino acid sequences are depicted in FIGS. 16 and 19 and the Sequence Listing appended hereto. Analysis of the sequences revealed an open reading frame of 2818 amino acids, although alternatively spliced products may code for different sized protein products. The gene extends for a minimum of 270 kb on chromosome 17, with its promoter in a CpG rich island. The NF1 sequence is highly conserved and shows homology to the GTPase activating proteins family (GAP).
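By way of illustration only, the relationship between the 2818 amino acid open reading frame and the approximately 13 kb transcript can be checked with simple arithmetic. The Python sketch below is not part of the original disclosure; the function name is hypothetical, and the 13 kb figure is the approximate Northern blot estimate, so the non-coding remainder is only a rough value.

```python
# Illustrative arithmetic only; coding_length_nt is a hypothetical helper name.

def coding_length_nt(n_amino_acids: int, include_stop: bool = True) -> int:
    """Nucleotides needed to encode a protein: 3 per codon, plus one stop codon."""
    return 3 * n_amino_acids + (3 if include_stop else 0)


ORF_AMINO_ACIDS = 2818   # open reading frame length stated in the text
TRANSCRIPT_NT = 13_000   # approximate transcript size by Northern blotting

coding_nt = coding_length_nt(ORF_AMINO_ACIDS)
untranslated_nt = TRANSCRIPT_NT - coding_nt  # rough: transcript size is approximate

print(f"coding region: {coding_nt} nt")
print(f"approximate untranslated remainder: {untranslated_nt} nt")
```

On these figures the coding region accounts for roughly two thirds of the transcript, with the remainder attributable to 5′ and 3′ untranslated sequence.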

The conclusion that the putative gene (NF1LT) is the gene involved in NF1, i.e. the NF1 gene, was based on several lines of evidence which are detailed below. First, this gene was clearly disrupted by the t(17;22) breakpoint, as shown at the DNA and RNA level. RNA analysis of the DCR1 human-mouse hybrid indicated that the NF1LT gene was functionally disrupted by the t(1;17) NF1 translocation as well. Even more compelling evidence was the identification of a 0.5 kb insertion in a new mutation NF1 patient. This insertion was located at least 10 kb away from the previously proposed genes EVI2A (EVI2) and EVI2B (NF1-c2). These candidate genes also failed to show abnormalities in NF1 patients and are apparently located in introns of NF1LT on the antisense strand. The large transcript size of NF1LT was also consistent with the high mutation rate of approximately 10⁻⁴/allele/generation.
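As a rough consistency check, the stated mutation rate can be compared with the prevalence (about 1 in 3,000) and new-mutation fraction (about 30 to 50%) given in the Background. The Python sketch below uses the standard direct method for an autosomal dominant disorder, in which each new case reflects a mutation in one of two alleles. It is illustrative only: the function name is hypothetical, and the calculation assumes full penetrance and that prevalence approximates incidence at birth, neither of which is asserted in the text.

```python
# Illustrative sketch of the direct method; assumptions are noted in the lead-in.

def mutation_rate_per_allele(incidence: float, new_mutation_fraction: float) -> float:
    """Per-allele, per-generation mutation rate for a dominant disorder.

    incidence: affected individuals per birth (assumed equal to prevalence here).
    new_mutation_fraction: share of cases with unaffected parents.
    Each individual carries two alleles, hence the division by 2.
    """
    return incidence * new_mutation_fraction / 2.0


INCIDENCE = 1 / 3000  # prevalence figure from the Background

low = mutation_rate_per_allele(INCIDENCE, 0.30)
high = mutation_rate_per_allele(INCIDENCE, 0.50)
print(f"estimated rate: {low:.1e} to {high:.1e} per allele per generation")
```

Under these assumptions the estimate falls between about 0.5 × 10⁻⁴ and 0.8 × 10⁻⁴, the same order of magnitude as the approximately 10⁻⁴ figure cited above.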

To further elucidate the normal function of the NF1 gene product, and to determine the pathophysiologic basis by which alterations in the gene give rise to neurofibromatosis, it was desirable to develop specific antibodies which recognize the NF1 protein product. As a parallel example, the understanding of molecular pathology in Duchenne muscular dystrophy (DMD) was greatly enhanced by the development of antibodies against the DMD gene product, dystrophin. Hoffman, E. P. et al., Cell 63:835-841 (1990). Once these antibodies were generated, it became possible to localize the protein in cells and to correlate abnormalities in DNA with protein alterations. Bonilla, E. et al., Cell 54:447-452 (1988) and Lidov, H. G. W. et al., Nature 348:725-728 (1990). The distinction between Becker and Duchenne muscular dystrophy can now be made on the protein level, providing a reliable diagnostic tool for patient evaluation. Hoffman, E. P. et al., New England J. Med. 318:1363-1368 (1988).

In order to study the NF1 gene product, antibodies were raised against both fusion proteins and synthetic peptides. Initial characterization using two anti-peptide antisera and one fusion protein antiserum demonstrated a unique protein of approximately 250 kDa by both immunoprecipitation and Western blotting. This protein was found in all tissues and cell lines examined and is detected in human, rat and mouse tissues. To demonstrate that these antibodies specifically recognize the NF1 protein, additional fusion proteins were generated which contained the sequence against which the synthetic peptide antisera had been made. Both peptide antisera recognized their respective fusion proteins. Immunoprecipitates using the peptide antisera were shown to recognize the same protein detected by immunoblotting with either the other peptide antiserum or the fusion protein antiserum. Examination of adult tissue homogenates by Western blotting demonstrated the presence of the NF1 protein in all tissues using antisera which recognize spatially distinct epitopes.

SPECIFIC EXAMPLE 1

I. Isolation and Characterization of the NF1LT Transcript

A. Isolation and Cloning

Two unique strategies were utilized to derive cDNA clones which define NF1LT. Initial experiments with the end-of-jump of clone EH1 obtained by chromosome jumping showed that a single-copy 1.4 kb EcoRI-HindIII subfragment, which lies just telomeric to the t(17;22) breakpoint, is conserved across species and is therefore a potentially useful probe in searching for transcripts in the region. This probe was used to screen a human peripheral nerve cDNA library constructed from human cauda equina RNA by Marion Scott and Kurt Fischbeck of the University of Pennsylvania Medical Center. This library is partially oligo-dT-primed and partially random-primed with the inserts cloned into the EcoRI site of λZAP (Stratagene, La Jolla, Calif.). 700,000 clones were plated on XL1-Blue cells and screened using the methodology described by Maniatis, T. et al., Molecular Cloning: A Laboratory Manual. Cold Spring Harbor: Cold Spring Harbor Laboratories (1982). The screening of this cDNA library resulted in the isolation of clone P5, which has an insert of 1.7 kb as shown in the Southern blot of FIG. 2A described below.

In an alternative approach, transcripts were sought using the YAC clone A113D7, part of an overlapping contig of clones from this region. As this YAC contains the entire breakpoint region, direct screening of cDNA libraries with this probe, although technically difficult, would be expected to yield an entire set of expressed transcripts. Field-inversion gel electrophoresis of YAC genomic DNA was performed in a 1.0% low melt agarose gel under conditions that separated the 270 kb YAC from the yeast chromosomes (160 volts, 65 hour run at 4° C. with a forward ramp of 6-48 seconds and a reverse ramp of 2-16 seconds). The YAC was cut out of the gel, equilibrated in digestion buffer, and digested with HincII. After re-equilibration in TE, the agarose was diluted in three volumes of water and melted at 68° C. Radiolabeling of the YAC was done in the diluted low melt agarose according to Feinberg, A. P. et al., Anal. Biochem. 137:266 (1984), except that 0.5 mCi was used in a final volume of 500 μl. After removing unincorporated counts using a spin column, probe was preannealed with human placental DNA at a final concentration of 1 mg/ml in 0.1 M NaCl for 15 minutes at 65° C. Phage lifts on nitrocellulose filters were prehybridized overnight in 6× SSC, 2× Denhardt's, 1 mM EDTA, 0.5% SDS. Hybridization was in the same solution for 48 hours. Filters were washed to a final stringency of 0.2× SSC, 0.1% SDS at 65° C. Clone B3A was isolated using the procedure described above from a B-lymphoblast cDNA library described by Bonthron, D. J. et al., J. Clin. Invest. 76:894 (1985) and contained a 0.8 kb insert. Subsequent analysis revealed that P5 and B3A overlap as shown in the Southern blot of FIG. 2A described below.
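The SSC concentrations used for hybridization (6×) and for the final wash (0.2×) can be related to salt molarity and to dilutions from a stock. The Python sketch below is illustrative only and is not part of the original protocol. It assumes the conventional definition of a 20× SSC stock (3.0 M NaCl, 0.3 M sodium citrate), which the text does not state; the function names and the example volumes are hypothetical.

```python
# Illustrative helper; assumes the conventional 20x SSC stock composition.

STOCK_FOLD = 20.0
STOCK_NACL_M = 3.0      # mol/l NaCl in 20x SSC (standard definition, assumed)
STOCK_CITRATE_M = 0.3   # mol/l sodium citrate in 20x SSC (standard definition, assumed)


def ssc_molarity(fold: float) -> tuple[float, float]:
    """Return (NaCl, sodium citrate) molarity for an SSC solution of the given fold."""
    scale = fold / STOCK_FOLD
    return STOCK_NACL_M * scale, STOCK_CITRATE_M * scale


def stock_volume_ml(target_fold: float, final_volume_ml: float) -> float:
    """Volume of 20x stock needed to prepare final_volume_ml at target_fold."""
    return final_volume_ml * target_fold / STOCK_FOLD


for fold in (6.0, 2.0, 1.0, 0.2):
    nacl, citrate = ssc_molarity(fold)
    print(f"{fold}x SSC: {nacl:.3f} M NaCl, {citrate:.4f} M citrate, "
          f"{stock_volume_ml(fold, 500.0):.1f} ml stock per 500 ml")
```

The lower salt concentration of the 0.2× wash relative to the 6× hybridization solution is what makes the final wash the more stringent step.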

Referring now to FIG. 1, a schematic of the NF1 region drawn to scale in kilobases is shown. The orientation on chromosome 17 is shown and the translocation breakpoints are indicated by arrows. The two anonymous probes shown are 17L1 and 1F10 described by Fountain, J. W. et al., Science 244:1085 (1989) and O'Connell, P. et al., Science 244:1087 (1989). The previously described candidate genes EVI2A (EVI2) and EVI2B (NF1-c2) are shown above the map line, with the arrow heads indicating the direction of transcription. The jump clone EH1 is indicated by the arc. The 270 kb YAC A113D7 is part of a contig covering this region and was obtained from the Center for Genetics in Medicine at Washington University by screening their YAC library with a probe derived from cosmid 1F10.

Referring now to FIG. 2, FIG. 2A shows Southern blot mapping of cDNA clones P5 and B3A. Genomic DNA from YAC clone A113D7 was digested with EcoRI, separated by gel electrophoresis and transferred to GeneScreen. Hybridization and wash conditions have been previously described by Drumm, M. L. et al., Genomics 2:346 (1988). The filter was probed sequentially with clone B3A (lane 1) and P5 (lane 2), with the filter being stripped between hybridizations. Each clone maps to specific EcoRI fragments in the YAC, the sizes of which are shown in kilobases. As shown in FIG. 2A, clone P5 contains a 1.7 kb insert and overlaps clone B3A.

Referring now to FIG. 2B, a schematic map of part of the NF1LT cDNA shows the extent of the open reading frame and 3′ untranslated regions, as well as cDNA clones P5 and B3A.

B. NF1LT Crosses the t(17;22) Breakpoint

To determine whether the NF1LT locus was interrupted by one or both translocation breakpoints, the 5′ end of P5 was used as a probe against a Southern blot of the translocation hybrids. See Schmidt, M. A. et al., J. Med. Genet. 28:771 (1987); Ledbetter, D. C. et al., Am. J. Hum. Genet. 44:20 (1989); Menon, A. G. et al., Genomics 5:245 (1989); Collins, F. S. et al., Trends in Genetics 5:217 (1989). These hybrids contain chromosome 17 sequences telomeric to the breakpoints, with the t(1;17) break (hybrid DCR1) occurring 60 kb centromeric to the t(17;22) break (hybrid NF13).

Referring now to FIG. 3, for this Southern blot, mouse, normal human and hybrid DNAs were digested with EcoRI, transferred to Hybond N, and hybridized as previously described by Wallace, M. R. et al., Nucleic Acids Res. 17:1665 (1989). The final wash was 1× SSC/0.1% SDS at 65° C. for 20 minutes. The probe was a 0.8 kb 5′ end fragment from the P5 Bluescript subclone, extending from the vector polylinker to the BstEII site shown in FIG. 2B. In FIG. 3, M represents mouse DNA; H, normal human DNA; 17, DNA from hybrid MH22-6, a mouse cell line containing human chromosome 17 as its only human material described by Tuinen, et al., Genomics 1:374 (1987); DCR1, the mouse hybrid containing the der(1) of t(1;17); and NF13, the hybrid containing der(22) of t(17;22).

The results in FIG. 3 show that the P5 5′ end probe detects two human EcoRI fragments of 15 and 4.0 kb. Two bands of 8.0 and 2.5 kb are seen in mouse DNA, indicating that this transcript is strongly conserved. In the translocation hybrids, DCR1 contains both human bands, but NF13 lacks the 4.0 kb band, indicating that part of NF1LT lies between these breakpoints.

Although the Southern blot presented in FIG. 3 appears to indicate that the P5 partial cDNA clone extends across the t(17;22) translocation breakpoint in an NF1 patient, subsequent experiments have shown this to be incorrect; the entire P5 clone actually lies telomeric to this breakpoint. The apparent absence of the 4.0 kb EcoRI genomic fragment from the der(22) chromosome in FIG. 3 is due to the fact that the translocation falls within this 4.0 kb interval, and the resulting breakpoint fragment happens to be precisely the same size (15 kb) as the other human genomic band in this lane of the blot.

Our subsequent additional cDNA cloning efforts indicate that the t(17;22) break interrupts the NF1LT cDNA 681 bp 5′ to the end of P5 (between exons 4 and 5 in the numbering system of Cawthon, R. M. et al., Cell 62:193 (1990)). The evidence for this is illustrated in FIG. 1 of Wallace, M. R. et al., Science 250:1749 (1990) which shows a Southern blot performed with genomic DNA amplified with PCR primers for exon 4 and exon 5. The PCR primer sequences were located in the introns adjacent to the exons and are given in Cawthon, R. M. et al., Cell 62:193 (1990). The source of DNA was normal human; NF13 mouse-human hybrid containing the der(22) chromosome from the t(17;22) NF1 translocation; mouse; and water as a negative control.

C. RNA Analysis

To determine the transcript size of NF1LT cDNA, clone P5 was used to probe Northern blots of RNA from a variety of tissues. As shown in FIGS. 4A and B, an approximately 13 kb transcript was visualized in brain, neuroblastoma and kidney tissue, and two melanoma cell lines. A hybridization signal of similar size was visible in RNA from several other tissues, although the bands were less discrete, probably due to degradation of the large transcript. Because of differences in degradation between RNA samples, it was not possible to judge from the band intensities the relative level of expression in different tissues.

In order to survey the pattern of expression of NF1LT in different normal and pathologic tissues, RNA Polymerase Chain Reaction (PCR) analysis as described by Gibbs, R. A. et al., PNAS (USA) 86:1919 (1989) was performed on a number of tissues using primers from the translated region. As shown in FIG. 4C, expression of NF1LT was apparent in a wide variety of human tissues, including those giving signals on Northern blots, as well as immortalized B-lymphoblasts (WBC) (both NF1 and non-NF1), NF1 skin fibroblasts, spleen, lung, muscle, thymoma, neuroblastoma, an NF1 neurofibrosarcoma cell line, a colon carcinoma cell line, and breast cancer. In other experiments, expression was also detected in colon, thyroid, parathyroid adenoma, lymphoma, endometrial carcinoma, K562 erythroleukemia cells, and normal skin fibroblasts. Leukocyte contamination of the solid tissue samples could potentially account for the PCR signals, but the fact that RNAs from various cell lines show expression indicates it is likely that NF1LT is widely expressed.

D. Expression of NF1LT Abolished in the t(1;17) Rearrangement

The primers used in the above experiments amplified a band of similar size in mouse RNA, showing that this translated region is conserved. In order to determine if the NF1LT gene was inactivated in three hybrid cell lines from NF1 patients with cytogenetic rearrangements, one PCR primer from the 3′ untranslated region was included, with the expectation that this area would be less conserved across species. The three hybrid cell lines included in this experiment were DCR1 and NF13, described above, and del(17). This last hybrid contains the deleted chromosome 17 from an NF1 patient with a deletion of part of the proximal long arm of chromosome 17. As shown in FIG. 4D, the expected product is seen in B-lymphoblasts (WBC) and human brain, but not in mouse fibroblast RNA. A hybrid containing the normal human chromosome 17 does express human NF1LT, but no product is visible with either NF1 translocation hybrid or with the del(17) hybrid, indicating that these rearrangements abolish expression.

II. DNA Abnormality in a New Mutation NF1 Patient

Identifying mutations in NF1 patients is crucial to correctly identifying the gene from among candidate genes. New mutation NF1 patients are particularly helpful, since comparison of their DNA with that of their parents allows the distinction between causative mutation and polymorphism to be more readily made. Since pulsed-field gel studies by several groups with probes in the NF1 region have failed to reveal large rearrangements (see Fountain, J. W. et al., Science 244:1085 (1989); O'Connell, P. et al., Science 244:1087 (1989)), patient DNA was examined at the Southern blot level with NF1LT. DNA from 35 individuals with NF1 was analyzed with the probe P5, and the autoradiograms were examined for abnormal bands or obvious differences in band intensity.

A single patient showed a difference with this assay. This NF1 new mutation patient is a 31-year-old white male who exhibits no café-au-lait spots or axillary freckling, but has macrocephaly, Lisch nodules, and multiple cervical nerve root tumors (shown histologically to be neurofibromas) requiring surgical debulking. He had a single cutaneous neurofibroma removed several years ago, and has no evidence of acoustic neuroma. His parents were carefully examined and displayed no features of NF1 or NF2. FIGS. 5A-D display Southern blots of DNA from this patient (lane 2) and his parents (lanes 1 and 3) using four enzymes and probe P5.

With EcoRI (FIG. 5A), the patient had a normal pattern except that his 4.0 kb allele was fainter than expected and he also demonstrated an abnormal fragment of 4.5 kb, with approximately the same intensity as the 4.0 kb band. This 4.5 kb band was not present in the parents. Similarly, with three other enzymes, an abnormal fragment approximately 0.5 kb larger than expected was seen. For PstI, the involved 12 kb fragment was the same one previously shown to contain the t(17;22) breakpoint. These abnormal bands were not seen in the 34 other NF1 patients (12 of whom represent new mutations) or 27 unaffected individuals. As a confirmation, the family members were resampled, and the same results were obtained. The family was studied with three highly polymorphic VNTR probes pYNZ22, pYNH24 and pEKMDA2 as described by Nakamura, Y. et al., Science 235:1616 (1987), and there was no indication of incorrect paternity.

Collectively, these data indicate that this patient possesses a novel mutation which appears to be an insertion of approximately 0.5 kb close to or within an exon of NF1LT. This new mutation, along with the evidence showing that the t(17;22) breakpoint interrupts the gene, and the PCR data showing that NF1LT expression is absent in the t(1;17) hybrid DCR1, indicates that NF1LT is the NF1 gene.

III. Partial Nucleotide Sequence and Analysis of NF1LT

Complete sequencing of P5 and B3A revealed an overlap of 507 bp, with the combined sequence being 2012 bp as shown in FIG. 16. A single open reading frame was identified extending from the beginning of P5 across nearly the entire sequence, which shows that B3A is located at the 3′ end and that transcription occurs toward the telomere. At the 3′ end of B3A, a stop codon occurs 181 bp from the end. However, no polyadenylation signal or poly(A) tail was evident, which implied that part of the 3′ untranslated region was missing. Comparisons of this DNA and protein sequence with the entries in GenBank (see Burks, C. et al., Meth. Enzymol. 183:3 (1990)), and the NBRF and SWISS-PROT databases (see Barker, W. C. et al., Meth. Enzymol. 183:31 (1990) and Kahn, P. et al., Meth. Enzymol. 183:23 (1990)), failed to show significant similarity with any known sequence. A hydropathy plot of the amino acid sequence revealed a primarily hydrophilic polypeptide. See Kyte, J. et al., J. Mol. Biol. 157:105 (1982). Other analyses failed to reveal any other recognizable motifs except for two potential N-glycosylation sites and three possible nuclear localization signals depicted in FIG. 16.
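The kinds of sequence analysis referred to above, a hydropathy profile and a scan for potential N-glycosylation sites, can be sketched as follows. This Python example is illustrative only and does not reproduce the software or parameters actually used. It applies the Kyte-Doolittle hydropathy scale with a sliding window and searches for the consensus sequon N-X-S/T (X other than proline); the window size, function names and example peptide are assumptions, and the peptide is not taken from the NF1 sequence.

```python
import re

# Kyte-Doolittle hydropathy values (positive = hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein: str, window: int = 9) -> list[float]:
    """Mean Kyte-Doolittle score for each full window along the sequence."""
    scores = [KYTE_DOOLITTLE[aa] for aa in protein.upper()]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def n_glycosylation_sites(protein: str) -> list[int]:
    """1-based positions of Asn in N-X-S/T sequons, where X is not Pro.

    A lookahead is used so that overlapping sequons are all reported.
    """
    return [m.start() + 1 for m in re.finditer(r"N(?=[^P][ST])", protein.upper())]


# Hypothetical peptide for demonstration; not part of the NF1 sequence.
example = "MKNGTAANPSLLNNTS"
print(n_glycosylation_sites(example))
print([round(v, 2) for v in hydropathy_profile(example, window=9)])
```

A profile whose window averages are mostly negative corresponds to a primarily hydrophilic polypeptide. Matches to the sequon indicate only potential sites, since actual glycosylation depends on cellular context.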

Referring now to FIG. 6, the postulated genomic structure of NF1 is schematically represented. The 2.0 kb cloned portion of the NF1LT cDNA contains at least 6 exons. The 5′ exon of the cloned transcript lies between the NF1 translocation breakpoints, within 12 kb of t(17;22). The size and number of genomic fragments detected by NF1LT probes P5 and B3A on Southern blots indicated that the 2.0 kb of cloned cDNA spanned at least 33 kb of genomic DNA. Thus it was unlikely that the remaining 5′ exons could lie entirely between the translocation breakpoints.

Based on the observation that CpG islands often lie at the 5′ ends of genes, especially housekeeping genes, the next 5′ CpG island was a potential site for the promoter of NF1. This island has been previously cloned in the NotI linking clone 17L1 shown in FIG. 1 and described by Wallace, M. R. et al., Nucleic Acids Res. 17:1665 (1989). It lies approximately 150 kb centromeric to the t(1;17) breakpoint and shows strong conservation across species, and subsequent investigation detailed in Specific Example 2 demonstrates that it does contain the promoter of the NF1 gene. The remainder of the NF1 gene was cloned by cDNA walking, YAC technology, and examination of genomic upstream sequences as described in Specific Example 2.

SPECIFIC EXAMPLE 2

I. Isolation of NF1 cDNA Clones

Five different cDNA libraries were used in the cDNA walk. Libraries included fetal muscle, fetal brain, adult brain (occipital pole and medulla), and endothelial cells. The human fetal brain cDNA library, oligo(dT) and random primed, was obtained from Stratagene, La Jolla, Calif. (#936206). The human fetal muscle library, oligo(dT) primed, was a gift of F. Boyce and is described in Koenig, M. et al., Cell 50:509-517 (1987). The adult human occipital pole cDNA library, random and oligo(dT) primed, was obtained from Clontech, Palo Alto, Calif. (#HL1091a). The adult human medulla library, random and oligo(dT) primed, was obtained from Clontech (#HL1089a). An endothelial cell library, random primed, was a gift of D. Ginsburg. Ginsburg, D. et al., Science 228:1401-1406 (1985). Clones P5, isolated from a peripheral nerve cDNA library, and B3A, from a B-lymphocyte library, were isolated as previously described herein and in Wallace, M. R. et al., Science 249:181-186 (1990). Since the NF1 transcript has been shown to be ubiquitously expressed (see Buchberg, A. et al., Nature 347:291-294 (1990); Wallace, M. R. et al., Science 249:181-186 (1990)), cDNA walking proceeded in the previously described multiple cDNA libraries in order to maximize chances of finding positives. Walks proceeded by isolation of positive phage clones using the most 5′ cDNA insert.

Typically, 500,000 plaques of each library were plated and screened as described in Benton, W. D. et al., Science 196:180-182 (1977), using an aqueous hybridization consisting of 6× SSC, 2× Denhardt's solution, 1 mM EDTA and 0.5% SDS at 65° C. Washes were in 2×, 1×, and, if needed, 0.2× SSC, 0.1% SDS at 65° C. The positive clones were characterized by restriction mapping using EcoRI and Southern blot analysis using previously isolated inserts. The phage clones were subcloned into Bluescript (Stratagene) or rescued as plasmid per λZAP instructions in the case of the fetal brain library, and the ends were sequenced to anchor the position of the clones to the transcript map. The cycle was repeated for each walk. Underrepresented regions in any given library were overcome by crossing into another library. The entire transcript as represented in the clones was sequenced multiple times and both strands were sequenced at least once for all previously unpublished sequence.
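The rationale for plating about 500,000 plaques per library can be illustrated with a simple sampling estimate. The Python sketch below is illustrative only: it assumes that plaques are independent random draws and that a clone of interest occurs at a fixed frequency in the library. The frequency of 1 in 100,000 is a hypothetical value chosen for demonstration, not a measured abundance of NF1 clones, and the function names are likewise hypothetical.

```python
import math

# Illustrative sampling estimate; the clone frequency used below is hypothetical.


def prob_at_least_one(n_plaques: int, clone_frequency: float) -> float:
    """Probability that at least one of n_plaques carries the clone.

    Computes 1 - (1 - f)^n using log1p/expm1 for numerical stability.
    """
    return -math.expm1(n_plaques * math.log1p(-clone_frequency))


def plaques_needed(clone_frequency: float, confidence: float = 0.99) -> int:
    """Smallest number of plaques giving the requested detection probability."""
    return math.ceil(math.log(1.0 - confidence) / math.log1p(-clone_frequency))


ASSUMED_FREQUENCY = 1e-5  # hypothetical: 1 positive clone per 100,000 plaques

print(f"P(hit) with 500,000 plaques: {prob_at_least_one(500_000, ASSUMED_FREQUENCY):.4f}")
print(f"plaques for 99% confidence: {plaques_needed(ASSUMED_FREQUENCY)}")
```

Under this assumed frequency, a screen of 500,000 plaques would be expected to recover at least one positive in more than 99% of attempts. A region that is underrepresented in one library lowers the effective frequency, which is consistent with the practice of crossing into another library.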

FIG. 7 is a schematic diagram representing the cDNA walk from the 3′ end of the NF1 gene. The open reading frame is represented by the wide region bounded by the ATG and TGA codons, and the extent of the GAP related domain used in the complementation studies of Ballester, R. et al., Cell 63:851-859 (1990) is indicated. Clones are listed below the schematic of the transcript; straight lines represent authentic transcript and jagged lines represent co-cloning events. EcoRI sites are represented by vertical lines. Clones B3A and P5 have been previously described herein and in Wallace, M. R. et al., Science 249:181-186 (1990). Clones AE25, KE-2, and GE-2 were isolated from the endothelial cell cDNA library. Ginsburg, D. et al., Science 228:1401-1406 (1985). Clones HF6B, FB50, HF7B, EF3, EF8, FF1, EF2, FF13, and HF3A were isolated from the fetal brain cDNA library (Stratagene, #936206). Clone CAT2 was isolated from the same library by PCR from total phage lysate from the library. See Ballester, R. et al., Cell 63:851-859 (1990). Clone AM20 was isolated from a human brain (medulla) cDNA library (Clontech, #HL1091a), and clone GM9A was isolated from the fetal muscle cDNA library of Koenig, M. et al., Cell 50:509-517 (1987).

As the cDNA walk neared completion, a very GC rich region of the transcript was encountered at the 5′ end that contained an abnormally high concentration of the dinucleotide CpG, as well as some rare cutting restriction endonuclease sites. Some of these sites had been previously placed on the pulsed field map of this region using the NotI linking clone 17L1. Fountain, J. W. et al., Amer. J. Hum. Genet. 44:58-67 (1989). This clone was isolated from a NotI linking library constructed from DNA from flow-sorted chromosome 17 (Wallace, M. R. et al., Nucleic Acids Res. 17:1665-1677 (1989)) and contains the sequences flanking both sides of a genomic NotI site. The telomeric half of the probe was used to detect a translocation breakpoint within the NF1 gene. Fountain, J. W. et al., Science 244:1085-1087 (1989). Southern blots using the CpG rich cDNAs as probes against 17L1 demonstrated that the most 5′ sequences obtained were indeed located in the centromeric half of this clone (17L1B), approximately 300 kb from the 3′ stop codon.

The most 5′ clone, KE-2, isolated from the endothelial cell cDNA library, contained an in frame stop codon as shown in FIG. 17, which depicts the cDNA sequence of the 5′ portion of the NF1 transcript. The sequence in FIG. 17 has not been previously published, and ends where previously published sequence began. Wallace, M. R. et al., Science 249:181-186 (1990); Cawthon, R. et al., Cell 62:193-201 (1990); Xu, G. et al., Cell 63:835-841 (1990). Sequence was compiled from clones KE-2, GE-2, GM9A, EF2, FF13, and HF3A. Both strands were sequenced at least once to complete the sequence. The nucleotide and deduced amino acid sequences are numbered along the left column. The start codon is underlined, and the upstream in frame stop codon is boxed. The position of the oligonucleotide used for primer extension (FIG. 8) is shown by an arrow. The position of the first intron is indicated by a triangle, and is the position where the alternate sequences diverge (FIG. 18).

Downstream from the stop codon, the first ATG fits the rules for a proper translational start. See Kozak, M., Cell 44:283-292 (1986). Overlapping sequences have been found in cDNA clones from three different tissues (fetal muscle, fetal brain, and endothelial cells). We propose that this ATG codon represents the authentic start codon, giving the protein a total of 2818 amino acids with a predicted molecular weight of about 327 kilodaltons.
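The reasoning in the preceding paragraph (an upstream in frame stop codon, followed by the first in frame ATG as the proposed start) can be illustrated with a short sketch. The function names and the toy sequence below are hypothetical and are not taken from FIG. 17; the sketch does not evaluate the Kozak consensus and only shows the codon scan itself.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

# Hypothetical toy sequence, NOT the NF1 sequence of FIG. 17:
# GCC TGA GCC GCC ATG GCT AAA TGA
TOY_SEQ = "GCCTGAGCCGCCATGGCTAAATGA"


def find_start_after_upstream_stop(seq, frame=0):
    """Scan codons in one reading frame.

    Returns (stop_index, atg_index): the index of the first in-frame stop
    codon and the index of the first in-frame ATG downstream of it.
    Either value is None if it is not found.
    """
    seq = seq.upper()
    stop_index = None
    for i in range(frame, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if stop_index is None:
            if codon in STOP_CODONS:
                stop_index = i
        elif codon == "ATG":
            return stop_index, i
    return stop_index, None


def orf_length_codons(seq, start):
    """Count sense codons from `start` up to (not including) the next
    in-frame stop codon. Returns None if no stop codon is reached."""
    seq = seq.upper()
    count = 0
    for i in range(start, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return count
        count += 1
    return None


stop_at, start_at = find_start_after_upstream_stop(TOY_SEQ)
print(stop_at, start_at, orf_length_codons(TOY_SEQ, start_at))
```

Applied to the numbers stated in the text, an open reading frame of 2818 codons plus one stop codon corresponds to 8457 nucleotides.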

II. Primer Extension

In order to determine whether a substantial portion of the 5′ end of this transcript remained uncloned, a primer extension was performed using human brain (frontal lobe) and melanoma cell line SK-MEL-23 poly A+ RNA. Total RNA was isolated from fresh human brain (frontal lobe) and a melanoma cell line SK-MEL-23 (Carey, T. E. et al., PNAS (USA) 73:3278-3282 (1976)) as described in Sambrook, J. et al., Molecular Cloning: A Laboratory Manual 2nd ed., Cold Spring Harbor: Cold Spring Harbor Laboratory (1989). Polyadenylated RNA was isolated from melanoma total RNA using the FastTrack mRNA isolation kit (Invitrogen Corp., San Diego, Calif.). For primer extension, an oligomer (5′ AGAGGCAAGGAGAGGGTCTGTG) (SEQ ID NO:12) was synthesized, kinased with ³²P, and extended off of brain (total RNA) and melanoma (poly A+ RNA) templates. Boorstein, W. R. et al., Primer Extension Analysis of RNA; Meth. Enzymol. 180:347-369, Academic Press, San Diego (1989). The reverse transcription primer chosen was 5′ of the proposed start codon, within the proposed exon 1 at the position shown in FIG. 17. Products were analyzed on a 6% denaturing polyacrylamide gel. FIG. 8 shows the result of this analysis.

As shown in FIG. 8, a series of four bands ranging in size from 380-410 bp is seen in the extension from brain RNA. A second prominent band of 300 bp is also seen. Primer extension from the melanoma RNA shows the top two bands at 400 and 410 bp, but does not show the lower band at 300 bp. Cloning and sequencing extended 119 bp from the 5′ position of this primer, indicating that at most only 291 bp (410 bp less 119 bp) remain uncloned. Within the cloned sequence lie the proposed ATG start and the upstream in frame stop codon. These results indicate that the entire coding region of the transcript has been cloned and sequenced.
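The estimate of uncloned 5′ sequence follows from simple subtraction, sketched below. The function name is hypothetical; the band sizes and the 119 bp figure are those stated in the text.

```python
def max_uncloned_bp(extension_product_bp, cloned_upstream_bp):
    """Upper bound on uncloned 5' sequence: the length of a primer
    extension product minus the sequence already cloned 5' of the primer."""
    return extension_product_bp - cloned_upstream_bp


CLONED_UPSTREAM_BP = 119           # sequence cloned 5' of the primer (from the text)
LONGEST_PRODUCT_BP = 410           # longest extension product seen in FIG. 8

print(max_uncloned_bp(LONGEST_PRODUCT_BP, CLONED_UPSTREAM_BP))
```

The same subtraction applied to the 300 bp band seen only with brain RNA gives 181 bp.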

III. Sequence Analysis of Clones

Double stranded sequencing of plasmid clones was performed using Sequenase Version 2.0 (U.S. Biochemicals, Cleveland, Ohio) per instructions. Sequence analysis was performed using the Pustell and Cyborg sequence analysis programs. Analysis of the amino acid sequence was performed with the GCG protein analysis package.

Sequencing of the proximal half of the NotI linking clone 17L1 (17L1B) demonstrated that the 5′ cDNA sequences from nucleotide 1 to 270 exist in this region of the genome in a single continuous exon. This indicates that exon 1 of this transcript has been cloned and contains a majority of sequence that is 5′ untranslated. If another exon exists upstream, it would contain entirely non-coding sequence. The transcriptional start site and the promoter region probably exist in the half linking clone 17L1B.

A. 5′ End Alternate Sequences

In the course of screening for cDNA clones that extended beyond our most 5′ clone, two alternate sequences were discovered and are shown in FIG. 18. Sequences numbered 1 and 3 are derived from clones isolated from a fetal brain cDNA library and were each represented by single isolates, not shown in FIG. 7. Sequence number 2 matches clones from cDNA libraries from three different tissues, and contains the 5′ end shown in FIG. 17. The position of the first splice junction is shown by a thick line separating exon 2 to the right and the three different exon 1 sequences to the left. Sequence number 1 has been shown by PCR to be unspliced message, and the consensus AG and lariat sequences are underlined. Sequence number 2 represents the most prominent species of mRNA, and has a proper consensus start codon. Kozak, M., Cell 44:283-292 (1986). Sequence number 3 has an in frame stop codon which is boxed.

Both alternate sequences begin at position 270, the position of the first intron-exon border, and are represented by single cDNA clones from a fetal brain cDNA library. Sequence 1 is an unspliced message, as it contains a perfect splice junction acceptor consensus sequence, a pyrimidine stretch of 20 bases, and a lariat formation consensus sequence. Sharp, P. A., Science 235:766-771 (1987). This was confirmed by designing primers that extend across the splice junction and showing that these primers will amplify the correct sized fragment using genomic DNA. If this sequence does represent an authentic mRNA, it contains an in frame stop codon just ten triplets upstream from the splice junction.

A second unusual clone diverges at the exact same position, yet has a different sequence, sequence 3, and thus must be an alternative splice product at the 5′ end. It contains an in frame stop codon only 15 triplets upstream from the point of divergence, without a methionine codon in the new sequence. This clone is also unusual in that it contains an Alu repeat at one end, and may be only partially spliced. However, if it does represent an authentic 5′ end sequence, it would code for a smaller protein product, possibly beginning translation at the next ATG 58 codons downstream from the splice junction.

B. 3′ End of Sequence

Characterization of the 3′ end of the NF1 transcript has not been completed, as a poly A tail has not been found in any cDNA clone. Previous sequence analysis has shown the proper position of the stop codon. See Wallace, M. R. et al., Science 249:181-186 (1990); Xu, G. et al., Cell 62:599-608 (1990). Downstream from this the sequence is very A rich, with some regions that may be capable of priming with oligo dT during construction of the cDNA libraries. The NF1 transcript has been estimated to be 13 kb by its migration on a Northern blot. Wallace, M. R. et al., Science 249:181-186 (1990). To date we have cloned and sequenced over 9 kb of this message. The primer extension results (FIG. 8) indicate that the majority of uncloned sequence from this transcript arises from a very long (approximately 4 kb) 3′ untranslated region. Alternatively, our estimates of transcript size may be incorrect, as size estimates in this range by Northern blotting are difficult.

At least two other alternate processed forms of this primary transcript have been discovered. A 54 bp insertion coding for an additional 18 amino acids (ASLPCSNSAVFMOLFPHQ) near the 3′ end of the transcript has been previously described. Cawthon, R. et al., Cell 62:193-201 (1990); Xu, G. et al., Cell 62:599-608 (1990); Cawthon, R. et al., Genomics 7:555-565 (1990). A 63 bp insertion coding for an additional 21 amino acids (ATCHSLLNKATVKEKKENKKS) within one of the most conserved regions of the GAP related domain has been discovered.
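Both insertions are a whole number of codons long, which is why each adds amino acids without shifting the downstream reading frame. A minimal sketch of that check follows; the function name is hypothetical, and the insertion lengths and the 21 residue peptide are those stated in the text.

```python
def codons_added(insert_bp):
    """Return the number of whole codons added by an insertion, or None
    if the insertion length is not a multiple of 3 (a frameshift)."""
    if insert_bp % 3 != 0:
        return None
    return insert_bp // 3


# Insertion lengths as stated in the text.
print(codons_added(54))   # insertion near the 3' end of the transcript
print(codons_added(63))   # insertion within the GAP related domain

# The 21 residue peptide quoted for the 63 bp insertion.
GAP_DOMAIN_INSERT = "ATCHSLLNKATVKEKKENKKS"
print(len(GAP_DOMAIN_INSERT) == codons_added(63))
```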

C. Complete Sequence

FIG. 19 shows the complete amino acid sequence of the primary NF1 transcript deduced from the open reading frame of sequenced clones from the cDNA walk. Boxed areas indicate the 4 blocks of homology most conserved among the GAP family of proteins. Wang, Y. et al., Cell Regulation 2:453-465 (1991). The positions of the alternatively spliced exons and their sequences are shown. There are no SH2 or SH3 domains (src homology domains), which are present in GAP. The protein shows no apparent membrane spanning region, and is predicted to be cytosolic by discriminant analysis. Klein, P. et al., Biochim. Biophys. Acta 815:468-476 (1985). A potential leucine zipper is present beginning at amino acid residue 1834, but this region is not predicted to be in an alpha-helical conformation due to the presence of a proline in the middle of the repeat. Six potential cAMP-dependent protein kinase phosphorylation sites and one potential tyrosine phosphorylation site are present. The sequence shows no significant homology to the recently described Bcr related GAP family, which includes n-chimaerin and GAPrho (Diekmann, D. et al., Nature 351:400-402 (1991)) and possibly the p85 of bovine brain phosphatidylinositol 3-kinase. Otsu, M. et al., Cell 65:91-104 (1991).

Referring again to FIG. 19, the boxed regions in Table 4 correspond to the most statistically significant regions of similarity among the GAP family of proteins, with the invariant residues marked with stars. Residues underlined with a thin line are the potential cAMP-dependent protein kinase recognition sites. Residues that are double underlined represent the potential tyrosine phosphorylation recognition sequence. The position of a 21 amino acid insertion representing an alternatively spliced product is shown with a dark triangle. The position of an 18 amino acid insertion (ASLPCSNSAVFFWLFPHQ) (SEQ ID NO:13) representing an alternatively spliced product is shown by an open triangle. Xu, G. et al., Cell 62:599-608 (1990).

We found three regions of our nucleotide sequence that were at variance with previously published sequence (Xu, G. et al., Cell 62:599-608 (1990)), two resulting in changes in the amino acid sequence. Residue number 496 in our clones shows an ATG methionine codon rather than an ATA isoleucine codon. Another sequence variation at residue 1183 shows a CTG leucine codon rather than the previously published CTC. Our clones also lacked an extra CAT histidine codon after residue number 1555. The latter two changes that we have noted agree with those of Martin, G. A. et al., Cell 63:843-849 (1990), from their sequence of a PCR clone of the GAP-related domain region.

D. YAC Mapping

The size of the NF1 gene has been determined by mapping cDNA clones back to a yeast artificial chromosome (YAC) contig that spans over 600 kb surrounding the gene. The 5′ end is just centromeric to the NotI site within the linking clone 17L1B (Fountain, J. W. et al., Amer. J. Hum. Genet. 44:58-67 (1989); Fountain, J. W. et al., Science 244:1085-1087 (1989); Wallace, M. R. et al., Nucleic Acids Res. 17:1665-1677 (1989)), the first intron beginning only 81 basepairs from the NotI site. The gene extends through the position of a t(1;17) translocation breakpoint and beyond the position of a t(17;22) translocation breakpoint. Fountain, J. W. et al., Science 244:1085-1087 (1989); O'Connell, P. et al., Science 244:1087-1088 (1989). The transcript extends toward the telomere a minimum of 270 kb, beyond the NruI site. This site is seen only in the YAC clones, which are unmethylated, and thus must be methylated in genomic DNA. The most 3′ clone isolated does not extend beyond the MluI site (also present only in the YAC clones), which defines a maximum gene size of 310 kb. The gene therefore extends between 270 and 310 kb. This assumes that the remainder of the 3′ untranslated region yet uncloned exists in a single exon. Not all of the intron-exon borders of the gene have yet been characterized, but from the number of bands on a genomic Southern blot it is estimated that the gene contains in excess of 30 exons. The three previously described embedded genes (EVI2A, EVI2B, and OMgp) are transcribed from the opposite strand and are contained within a single intron. See Cawthon, R. et al., Cell 62:193-201 (1990); Cawthon, R. et al., Genomics 7:555-565 (1990); O'Connell, P. et al., Science 244:1087-1088 (1989); Mikol, D. D. et al., J. Cell. Biol. 110:471-479 (1990); Xu, G. et al., Cell 62:599-608 (1990); Xu, G. et al., Cell 63:835-841 (1990); Viskochil, D. et al., Mol. Cell. Biol. 11, in press (1991).

FIG. 9 details the data generated by mapping the cDNA clones against the YAC contig. The code for restriction sites is N for NotI, U for NruI and M for MluI. The boxed sites represent the positions of undermethylated CpG islands in genomic DNA.

E. Discussion of NF1 Gene Product: Tumor Suppressor Model

Initial partial sequences of the 3′ end of the NF1 gene revealed little in the way of sequence homologies that could provide a clue to the function of the gene product. Cawthon, R. et al., Cell 62:193-201 (1990); Cawthon, R. et al., Genomics 7:555-565 (1990); Wallace, M. R. et al., Science 249:181-186 (1990). Further cloning and sequence analysis revealed homology to the mammalian GAP protein and the yeast IRA1 and IRA2 gene products, which modulate the activity of the p21 ras protein in their respective hosts by accelerating the rate at which ras hydrolyzes GTP to become inactive ras-GDP. Ballester, R. et al., Cell 63:851-859 (1990); Buchberg, A. et al., Nature 347:291-294 (1990); Xu, G. et al., Cell 62:599-608 (1990); Xu, G. et al., Cell 63:835-841 (1990). This provided the first glimpse of the function of NF1; like GAP, it may be an upstream regulatory protein for ras (or a ras-related protein), with its normal function being to down-regulate a member of the ras family involved in mitogenic signal transduction. This model received further support when it was shown that the proposed GAP related domain of NF1 could complement loss of IRA function in yeast, and that it could stimulate ras-GTPase activity in vivo and in vitro. Ballester, R. et al., Cell 63:851-859 (1990); Martin, G. A. et al., Cell 63:843-849 (1990); Xu, G. et al., Cell 63:835-841 (1990).

An alternate model of ras-NF1 interaction postulates that NF1 may instead be a downstream effector for ras; a downstream model has also been proposed for the related GAP. Adari, H. et al., Science 240:518-521 (1988); Cales, C. et al., Nature 332:548-551 (1988); Hall, A., Cell 61:921-923 (1990); Yatani, A. et al., Cell 61:769-776 (1990). In this model the NF1 protein would be a target of activated ras, as a downstream member in the signal transduction cascade. Support for this model relies on the following observation. Both GAP and NF1 are thought to interact with ras p21 through its effector domain. Mutations in the putative ras effector domain inactivate the transforming ability of ras and prevent the GAP related domain of NF1 from GTPase activation (Martin, G. A. et al., Cell 63:843-849 (1990); Xu, G. et al., Cell 63:835-841 (1990)); however, these studies require repetition with the full length protein. It also should be cautioned that recent work using a temperature sensitive effector domain mutant of ras has shown dissociation between stimulation of GTPase activity by GAP and the NF1 GAP domain and their role as ras effectors. DeClue, J. E. et al., Mol. Cell. Biol. 11:3132-3138 (1991). This result is consistent with a ras effector molecule distinct from NF1 or GAP.

Nonetheless, the downstream effector model is attractive for its ability to account for the role of ras in certain cells of neuroectodermal origin. In the rat pheochromocytoma cell line PC12, activated ras induces differentiation and blocks proliferation. Bar-Sagi, D. et al., Cell 42:841-848 (1985); Noda, M. et al., Nature 318:73-75 (1985). Inhibition of ras in these cells blocks neural differentiation normally induced by nerve growth factor. Hagag, N. et al., Nature 319:680-682 (1986); Szeberenyi, J. et al., Mol. Cell. Biol. 10:5324-5332 (1990). Activated ras has also been shown to induce cell cycle arrest when introduced into rat Schwann cells. Ridley, A. J. et al., EMBO J. 7:1635-1645 (1988). This is significant because Schwann cells may be the original cells that recruit other cell types in the formation of neurofibromas and neurofibrosarcomas. Ratner, N. et al., Ann. Neurol. 27:496-501 (1990); Sheela, S. et al., J. Cell. Biol. 111:645-653 (1990). Therefore, if NF1 is the target (effector) of ras, then loss of NF1 function in Schwann cells could lead to a block of normal differentiation resulting in uncontrolled proliferation.

Either of these two models, the upstream regulatory protein model or the downstream effector model, is consistent with NF1 being a tumor suppressor gene, with the phenotype being the result of loss of both alleles of the gene. Knudson, A. G., Cancer Res. 45:1437-1443 (1985). Previous studies of NF tumors did not always show a consistent loss of heterozygosity for NF1 in 17q11.2. Skuse, G. R. et al., Genes Chrom. Cancer 1:36-41 (1989); Menon, A. G. et al., PNAS (USA) 87:5435-5439 (1990); Glover, T. W. et al., Genes Chrom. Cancer 3:62-70 (1991). However, these interpretations were made difficult by frequent losses on 17p, apparently reflecting the major role that loss of the p53 gene plays in tumor progression in this disorder. Nigro, J. M. et al., Nature 342:705-708 (1989). More recent analyses have indicated that loss of heterozygosity involving only 17q can be demonstrated at least for some tumors. Skuse, G. R., Science 250:1749 (1990); Legius and Glover, personal communication; Ponder, personal communication. In those tumors not showing loss of heterozygosity, it is likely that the mutation in the second allele is an independently acquired mutation.

The evidence that the NF1 gene is a tumor suppressor gene is strengthened by the unusual chromosomal aberration found in a sporadic case of NF1 described by Andersen, L. B. et al., Cytogenet. Cell Genet. 53:206-210 (1990). This patient shows the formation of a minichromosome by excision of the proximal region of 17q, which is lost in about 6% of somatic cells grown in culture. With one allele lost, this fraction of the somatic cells effectively becomes hemizygous for the NF1 gene. Subsequent somatic mutations can account for the second mutation and development of neurofibromas.

It is also possible that the second hit mutation may not be within the NF1 gene. Mutations within some other gene involved in the pathway toward development of benign neurofibromas may be in another as yet undefined locus. This would require that a gene dosage of one half be sufficient to cause the formation of neurofibromas. In this scenario the malignant neurofibrosarcomas may require both NF1 alleles to be mutated.

The complete sequencing of the NF1 protein product illustrates the lack of SH2 and SH3 domains, in contrast to GAP. Homologous to non-catalytic regions of the oncogene src, these domains are thought to direct interactions with phosphotyrosine proteins involved in signal transduction. Koch, C. A. et al., Science 252:668-674 (1991). Their absence in NF1 implies that NF1 and GAP are not interchangeable in the cell, and that NF1 is probably not directly modulated through tyrosine phosphorylation by activated growth factor receptors. The potential sites for tyrosine and serine/threonine phosphorylation imply that intermediates between activated receptors and NF1 may be involved in modulation of NF1 activity, since there is evidence that the NF1 protein is phosphorylated on serine and threonine (Downward, personal communication).

A potential candidate for this intermediate could be one of the members of the ERK family, which are activated by tyrosine phosphorylation by nerve growth factor and are themselves serine/threonine protein kinases. Boulton, T. G. et al., Cell 65:663-675 (1991). Certainly, the large size of the product in relation to the small portion conferring GAP activity indicates that other domains may be involved in modulating the ras-GTPase activity of this protein, or in carrying out entirely different functions. The fact that sequence homology with yeast IRA1 and IRA2 extends beyond the GAP catalytic domain toward both termini indicates that there may be more extensive functional homology with these members of the GAP family. Statistical analysis has also shown NF1 to be more closely related to IRA2 than to IRA1. It may be useful to think of the NF1 product as consisting of three domains: an amino-terminus of unknown function, a GAP-related middle domain, and a carboxy-terminus related to the IRA1 and IRA2 gene products. Unfortunately, the functions of these domains of the IRA gene products are not known even in this simple eukaryote. For NF1, interaction with other factors in the pathway leading to differentiation, such as the low-affinity nerve growth factor receptor and the trk oncogene product, or factors that mediate between these and NF1, may be localized to these yet undefined regions. Interactions at these other domains may also ultimately provide a clue as to the reasons why a mutation in a ubiquitously expressed gene reveals its character predominantly in cells derived from the embryonic neural crest.

Although it is presently unclear how the alternatively spliced transcripts play their unique role in NF1 function, in some tissues their expression is not mutually exclusive. These alternative forms may play a role in the diverse clinical manifestations of the disorder. Germline mutations in some of the alternatively spliced exons may give rise to some of the more unusual NF1 phenotypes.

The large NF1 transcript and 300 kb gene size represent a large target for mutations. Assuming that the NF1 phenotype results from loss of function mutations, causative mutations may be dispersed throughout the coding and regulatory region. In the patients surveyed thus far, this seems to be the case. Cawthon, R. et al., Cell 62:193-201 (1990); Viskochil, et al., personal communication. The size of the gene alone, however, cannot fully account for the high mutation rate. At best, the possible target size is only a factor of 10 larger than other genes, whereas the mutation rate is about 100 fold higher than what is usual for a single locus, with the majority of new mutations of paternal origin. Jadayel, D. et al., Nature 343:558-559 (1990). It remains to be determined whether paternal germline DNA is more mutable due to methylation patterns associated with genomic imprinting.
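The argument that gene size alone cannot account for the mutation rate reduces to a ratio, sketched below. The function name is hypothetical, and the sketch assumes, purely for illustration, that mutation rate scales linearly with target size; the fold values are the approximate figures given in the text.

```python
def unexplained_fold(rate_fold_increase, target_size_fold_increase):
    """Fold excess in mutation rate that remains after dividing out the
    larger target size, assuming rate scales linearly with target size."""
    return rate_fold_increase / target_size_fold_increase


RATE_FOLD = 100      # mutation rate relative to a usual single locus (from the text)
TARGET_FOLD = 10     # best-case target size relative to other genes (from the text)

print(unexplained_fold(RATE_FOLD, TARGET_FOLD))
```

Under that assumption, roughly a 10 fold excess in mutation rate is left unexplained by target size.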

SPECIFIC EXAMPLE 3

I. Materials and Methods

A. Peptides and Fusion Proteins

Peptides D (CQVQKQRSAGSFKRNSIKKIV; residues 2798-2818 of SEQ ID NO:2) and H (CNPRKQGPETQGSTAELIG; residues 509-528 of SEQ ID NO:2) were synthesized and conjugated to keyhole limpet hemocyanin at a coupling ratio of 13:1. Each peptide synthesized was verified by analysis of the amino acid composition. Fusion proteins were produced by subcloning the appropriate cDNA fragment into the pMAL.c vector (New England BioLabs) so that the reading frame was maintained. pMAL.B3A is a 219 nt PstI-HindII fragment (residues 2746-2819) of the NF1 cDNA containing a 73 amino acid open reading frame and the natural termination codon (Marchuk, D. A. et al., manuscript submitted). pMAL.HF3A.P is a 918 nt HpaI-PstI fragment (residues 65-371), while pMAL.HF3A.X is a 3523 nt HpaI-XhoI fragment (residues 65-1240) of the NF1 cDNA. The fusion protein constructs were verified by restriction enzyme analysis before transformation into host BL21 (DE3) cells. These cells were grown to an OD₆₀₀ of 0.6 in Luria broth and induced in 0.4 mM isopropyl-β-D-thiogalactoside (IPTG) for 3 hours. Cells were solubilized in Laemmli buffer by boiling for 10 minutes and analyzed on SDS polyacrylamide gels. Studier, F. W. et al., Meth. Enzymol. 185:60-69 (1990).

B. Antisera

Female rabbits (4 kg) were initially injected with either 500 micrograms of purified peptide or 250 micrograms of fusion protein embedded in emulsified polyacrylamide gel slices in complete Freund's adjuvant and boosted every four to six weeks with the same amount of antigen in incomplete Freund's adjuvant. Preimmune and immune sera were collected by ear venipuncture, allowed to clot and centrifuged at 2500 rpm for 15 minutes. Thimerosal was added as a preservative at a final concentration of 0.01% and the antisera were stored at −20° C.

C. Radiolabeling and Cell Lysis

2×10⁶ cells were incubated overnight in growth media and then labeled with ³⁵S-methionine by incubating cells in 2 ml methionine-free media containing 0.25 mCuries of labeling-grade ³⁵S-methionine (Amersham) per milliliter media overnight. Cells were washed once in phosphate-buffered saline, pH 8, and lysed in modified RIPA buffer (150 mM NaCl, 50 mM Tris-HCl, pH 7.5, 1% Nonidet P-40, 0.25% sodium deoxycholate, 2 mM EGTA, 1 mM phenylmethanesulfonyl fluoride, 1 mM leupeptin and 0.1 mM aprotinin) for 20-30 minutes at 4° C. Lysates were then clarified by centrifugation at 14,000 rpm for 10-15 minutes. Harlow, E. et al., Antibodies: A Laboratory Manual, Cold Spring Harbor, N.Y., pp. 421-470 (1988).

Whole cell lysates were prepared in a similar fashion. Cells were lysed in modified RIPA buffer at a density of 10⁷ cells/ml lysis buffer at 4° C. for 20 minutes and then clarified. Mouse tissue homogenates were prepared from freshly harvested organs by extensive homogenization using a Dounce tissue grinder and repeated passage through a 20 gauge needle prior to lysis in modified RIPA buffer at 4° C. for 20 minutes. Lysates were then clarified by centrifugation at 14,000 rpm for 10-15 minutes. The amount of protein was determined by Biorad Protein Assay according to the protocol specified by the manufacturer.

D. ELISA and Immunoprecipitation

Enzyme-linked immunosorbent assays (ELISA) were performed using horseradish peroxidase-conjugated goat anti-rabbit antibodies (Bethesda Research Laboratories, Bethesda, Md.) as per the manufacturer's specifications. All rabbit sera were tested by ELISA against the immunizing antigen, KLH and an irrelevant antigen. ELISA plates were coated overnight with antigens, KLH, and IgG positive controls prior to blocking with 2% bovine serum albumin (Sigma) in PBS for 2 hours. Antisera were added over a dilution range of 1:50 to 1:80,000 for 90 minutes. Goat anti-rabbit horseradish peroxidase (HRP) conjugate (IgG heavy and light; Bethesda Research Laboratories) was added for 1 hour after washing in PBS. The ELISA was developed with o-phenylene-diamine (Zymed) in citrate buffer (0.05 M citric acid, 0.1 M NaH₂PO₄, pH 5.0) and 0.03% hydrogen peroxide. Results were analyzed on a Dynatech MR650 96 well plate reader at an optical density of 490 nm.

Cells were lysed as described in the previous section, cleared by centrifugation in a microfuge at 14,000 g for 15 minutes and then incubated with rabbit antisera for 2 hours at 4° C. Sepharose protein A beads (Pharmacia CL4B) were added and incubated for an additional 2 hours with regular agitation. Immunoprecipitates were washed twice in buffer 1 (10 mM NaPO₄, pH 7.6; 100 mM NaCl; 1 mM EDTA; 1% Triton X-100; 0.5% sodium deoxycholate; 0.1% SDS), twice in buffer 2 (20 mM Tris-HCl, pH 8.3; 250 mM NaCl; 1% Nonidet P40; 0.1% SDS) and then twice more in buffer 1 before Laemmli buffer was added. The samples were then boiled for 5 minutes and loaded onto SDS polyacrylamide gels for analysis. Harlow, E. et al., Antibodies: A Laboratory Manual, Cold Spring Harbor, N.Y., pp. 421-470 (1988).

E. SDS-PAGE and Western Blot Analysis

Samples were separated by standard SDS-polyacrylamide gel electrophoresis (Laemmli, U. K., Nature 227:680-685 (1970)) and transferred overnight in modified Towbin transfer buffer (50 mM Tris-HCl, 380 mM glycine; 18) with 10% methanol at 250-750 milliamps. Nitrocellulose (Gelman, Ann Arbor, Mich.) or polyvinylidene fluoride (PVDF) Immobilon (Millipore) membranes were used for Western transfer as per the manufacturer's specifications. Filters were washed in PBS and stained with Ponceau-S.

Western blot analysis was performed using either alkaline phosphatase (AP) or horseradish peroxidase (HRP) conjugated secondary antibodies as per the manufacturer's recommendations. Filters were blocked overnight in 5% ovalbumin-TBST prior to Western blot analysis. All antibodies were diluted in 2% ovalbumin (Sigma, St. Louis, Mo., grade IV) with TBST (20 mM Tris-HCl, pH 7.5, 150 mM NaCl, 0.05% Tween). Membranes were incubated with anti-peptide sera for 1-2 hours at room temperature, washed four times in TBST, incubated with the appropriate secondary AP or HRP conjugated antibodies for 1 hour at room temperature and then washed four times in TBST. Development was performed using 5-bromo-4-chloro-3-indolyl phosphate with nitroblue tetrazolium (AP method; Sigma) or 4-chloro-1-naphthol (Sigma) and 0.03% hydrogen peroxide (HRP method).

F. Purification of Antibodies

Three techniques have been used in our laboratory to purify mouse or rabbit antibodies from whole sera and can be applied to purify the antisera described above. The first method involves the use of a Protein A-Sepharose column. This column will bind immunoglobulins of predominantly the IgG class, which can then be eluted from the column using 0.1 M glycine pH 23. After Protein A column purification, antibodies are predominantly of the IgG class and may be further purified by affinity purification. The second technique involves affinity purification of antibodies through the use of a peptide-Sepharose column. The peptide originally used as an immunogen in rabbits or mice is coupled to activated Sepharose beads and extensively washed. Sera from rabbits or mice are applied to these columns and eluted as before. The resulting eluate should contain only antibodies which react to the peptide used as an immunogen. Alternatively, when fusion proteins are used as immunogens, the immunizing fusion protein is separated from total E. coli cell proteins by SDS-polyacrylamide gel electrophoresis prior to transfer to nitrocellulose membranes. The fusion protein is visualized on the nitrocellulose membrane by Ponceau-S staining and then cut out for immunoadsorption. Antibodies directed against the fusion protein are eluted in glycine buffer after adsorption to these nitrocellulose strips. The resulting eluate should contain only antibodies which recognize the fusion protein immunogen.

II. Results

A. Fusion Proteins

Fusion proteins generated using the pMAL.c expression vector were analyzed by SDS-PAGE to determine whether the desired fusion protein was expressed. Cell extracts were prepared as described in the Materials and Methods section and separated by 7.5% SDS-PAGE. Fusion proteins were detected by Coomassie Blue staining. Overexpression of pMAL.HF3A.X, pMAL.HF3A.P, and pMAL.B3A is shown in FIG. 11, in which the lanes contained (A) 25 microliters of pMAL vector alone expressing the maltose-binding protein, (B) 25 microliters of pMAL.B3A, (C) 50 microliters of pMAL.HF3A.P, and (D) 50 microliters of pMAL.HF3A.X. The fusion proteins are overexpressed with yields ranging from 0.25 to 2.0 mg/ml. Molecular weight size standards were run in the first lane. Fusion proteins are denoted by the arrows. The yield of fusion protein per microliter of bacterial cell lysate varied depending on the size of the fusion protein. The pMAL.c vector alone and pMAL.B3A typically yielded 1-2 mg/ml fusion protein whereas the pMAL.HF3A series yielded 0.25-0.5 mg/ml fusion protein.

B. Antisera Recognize a 250 kDa Protein

The antisera raised in rabbits against peptides D and H as well as the pMAL.B3A fusion protein identified a unique protein which migrates as a 250 kDa species. Immunoprecipitation, with the results depicted in FIG. 12(A), was conducted as follows: HeLa cells were radiolabeled with ³⁵S-methionine overnight and lysates were immunoprecipitated with H and D peptide antisera. 500 microliters of lysate was incubated with 8 microliters of sera prior to the addition of 35-40 microliters of 50% Protein-A Sepharose CL-4B beads. Precipitates were separated by 7.5% SDS-PAGE, fixed in 25% isopropanol-10% acetic acid, dried down and exposed on Kodak film at −70° C. for 3 days. The lanes in FIG. 12(A) contained the following: lane 1, preimmune D peptide serum; lane 2, immune D peptide serum; lane 3, preimmune H peptide serum; lane 4, immune H peptide serum. Molecular size standards are indicated on the left margin.

The Western blot shown in FIG. 12(B) was prepared as follows: HeLa cells were lysed, separated by 7.5% SDS-PAGE and transferred to PVDF membranes prior to Western blot analysis using H peptide and pMAL.B3A fusion protein sera at a 1:1000 dilution. Alkaline phosphatase conjugated goat anti-rabbit antibodies (Cappell) were used at a 1:3000 dilution. The lanes in FIG. 12(B) contained the following: lane 1, preimmune H peptide serum; lane 2, immune H peptide serum; lane 3, preimmune pMAL.B3A fusion protein serum; lane 4, immune pMAL.B3A fusion protein serum. The arrows point to the 250 kDa NF1GRP. Molecular size standards are indicated on the left margin.

As shown in FIG. 12, immune H peptide antiserum detects a 250 kDa protein by immunoprecipitation of the ³⁵S-methionine radiolabeled tissue culture cells (FIG. 12A, lane 4) as well as by Western immunoblotting against the whole cell lysates transferred onto Immobilon or nitrocellulose filters (FIG. 12B, lane 2). There also appear to be other protein bands which are not consistently seen. At this time, we cannot exclude the possibility that these other proteins represent degradation products of the protein, alternative forms of the protein, or co-migrating species. The fact that incubation of the H and D peptide antisera with their respective peptides eliminates immunoprecipitation and Western blot detection of the 250 kDa protein but not the other protein bands supports the conclusion that these additional proteins are not related to the NF1 gene product.

Antisera raised against the D peptide likewise detect a unique protein migrating at 250 kDa by immunoprecipitation of radiolabeled cells (FIG. 12A, lane 2). However, Western immunoblotting using these antisera does not reproducibly identify unique proteins. Antisera raised against the pMAL.B3A fusion protein detect a unique protein migrating at 250 kDa by Western blotting (FIG. 12B, lane 4). This 250 kDa protein has been identified in all tissues examined, including brain, Schwann cells, fibroblasts, lymphocytes, hepatocytes, erythroleukemia cell lines, and melanoma cell lines. It has also been identified in human, mouse and rat cells. No cell line tested to date has failed to express this protein. There also does not appear to be major quantitative variation between the various cell lines examined.

C. Antisera Recognize Independently Generated Fusion Proteins

In order to demonstrate that the protein identified actually represents the NF1 gene product, we tested the antisera described above against fusion proteins which contain the amino acid sequence used to synthesize the peptides. Three fusion proteins were generated by ligating a portion of the NF1 cDNA into the fusion protein vector, pMAL.c. As shown in the fusion protein and synthetic peptide map in FIG. 10, two of these fusions (pMAL.HF3A) represent a discriminatory set, one which contains the H peptide epitope (pMAL.HF3A.X) and another which does not (pMAL.HF3A.P). The catalytic domain is defined as the 412 amino acids between residues 1125 and 1537. The other fusion protein, pMAL.B3A, contains the D peptide epitope.

As can be seen in FIG. 10, the D peptide sera recognizes the pMAL.B3A fusion protein. BL21 (DE3) cells containing the pMAL.B3A plasmid were induced in IPTG for 3 hours and lysates prepared. 50 microliters of whole cell lysate were added in each lane and separated by 7.5% SDS-PAGE prior to transfer to PVDF membranes. Western blot analysis was performed using 1:5000 dilutions of the D antisera and developed using the alkaline phosphatase method. Immune (lane 2) but not preimmune (lane 1) D peptide serum recognizes the pMAL.B3A fusion protein. Migration of the fusion protein is indicated on the right panel as detected by Ponceau-S staining (arrow).

Likewise, by Western blotting, H peptide antisera recognize the pMAL.HF3A.X fusion protein but not the pMAL.HF3A.P fusion protein, which lacks the H peptide epitope (data not shown).
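The correspondence between the cDNA fragments used for the fusion proteins and the amino acid residues they encode can be checked with simple codon arithmetic. The following is a minimal sketch, assuming the coordinates annotated in the sequence listing (CDS 190..8646; fragment positions 8425..8646, 382..1302 and 382..3909); the function names are illustrative only and are not part of the disclosure.

```python
# Coordinates taken from the feature annotations of the nucleotide sequence listing.
CDS_START = 190   # first nucleotide of the initiator ATG
CDS_END = 8646    # last nucleotide of the stop codon

# The CDS includes the stop codon, so the protein has one residue fewer than codons.
N_RESIDUES = (CDS_END - CDS_START + 1) // 3 - 1


def nt_to_residue(nt_pos: int) -> int:
    """Return the 1-based residue whose codon contains nucleotide nt_pos.

    Positions inside the stop codon are clamped to the last residue.
    """
    if not CDS_START <= nt_pos <= CDS_END:
        raise ValueError(f"position {nt_pos} lies outside the coding sequence")
    codon_index = (nt_pos - CDS_START) // 3 + 1
    return min(codon_index, N_RESIDUES)


def fragment_residues(nt_start: int, nt_end: int) -> tuple[int, int]:
    """Map a nucleotide fragment to the range of residues it touches."""
    return nt_to_residue(nt_start), nt_to_residue(nt_end)


# Fragment coordinates as annotated in the sequence listing.
FRAGMENTS = {
    "pMAL.B3A": (8425, 8646),
    "pMAL.HF3A.P": (382, 1302),
    "pMAL.HF3A.X": (382, 3909),
}

if __name__ == "__main__":
    for name, (start, end) in FRAGMENTS.items():
        first, last = fragment_residues(start, end)
        print(f"{name}: nt {start}..{end} -> residues {first}..{last}")
```

Under these assumptions the computed ranges agree with the residue ranges annotated for the three fusion proteins in the amino acid sequence listing (2746..2818, 65..371 and 65..1240).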

D. Antisera Recognize the Same 250 kDa Protein

In order to confirm that the protein recognized by the D peptide, H peptide and pMAL.B3A fusion protein antisera represented the same protein, mouse brain lysates were immunoprecipitated using H peptide antisera and transferred to Immobilon for Western blot analysis using the pMAL.B3A fusion protein antisera. As can be seen in FIG. 14, the immune pMAL.B3A fusion protein antiserum recognized the same protein species precipitated by the H peptide antisera and both H peptide and pMAL.B3A fusion protein antisera recognize the same protein precipitated with the D peptide antiserum. Mouse brain lysates were immunoprecipitated with immune D peptide sera (1:60 dilution of antisera), separated by 7.5% SDS-PAGE and visualized by Ponceau-S staining after transfer to PVDF membranes. The immune H peptide (lane 1) and pMAL.B3A fusion protein (lane 2) sera (1:1000 dilution) detected the 250 kDa protein seen in precipitates by Western blot analysis using the alkaline phosphatase detection method. The preimmune serum did not recognize the 250 kDa protein (lane 3). Preincubation of H peptide antiserum with 10 micrograms of purified H peptide for 2 hours at 4° C. inhibited the recognition of the immunoprecipitated protein (lane 4). Molecular size standards are indicated on the left margin.

E. NF1 Protein Expressed in Adult Mouse Tissues

The expression of the NF1 protein in adult mouse tissues was investigated by homogenizing a survey of tissues from freshly euthanized adult female mice. Equivalent amounts (100 micrograms) of total tissue extract were separated by electrophoresis on 7.5% SDS-polyacrylamide gels, transferred to PVDF membranes and stained with Ponceau-S to verify equivalent loading of total protein prior to Western blotting. Using the pMAL.B3A fusion protein antiserum (1:100 dilution), the 250 kDa NF1 protein was detected in brain, lung, spleen, kidney, muscle, and colon as indicated at the arrows in FIG. 15. Molecular size is indicated on the left margin. Similar results were obtained using the H peptide antiserum. Although differences between the various tissues examined by Western blotting are evident, there are minor variations in quantities detected from experiment to experiment which preclude quantitative comparisons.

F. Discussion of NF1 Gene Product

Although von Recklinghausen neurofibromatosis (NF1) is a disorder involving neural crest-derived tissues, NF1 mRNA appears to be ubiquitously expressed. Wallace, M. R. et al., Science 249:181-186 (1990). The level of tissue specificity might reflect either tissue-specific expression or tissue-specific interactions of the NF1 gene product with other signal transduction proteins. It is possible that the levels of the NF1 protein are significantly greater in neural crest-derived tissues or that it is post-translationally modified (phosphorylation, myristylation, etc.) in a tissue-specific manner. Conversely, the tissue specificity may reflect the association of the NF1 protein with other signal transduction molecules which are expressed predominantly in the neural crest. Possible candidates include nerve growth factor receptor (Klein, R. et al., Cell 65:189-197 (1991)), the trk protooncogene protein (Hempstead, B. L. et al., Nature 350:678-683 (1991)), ERK1 (Boulton, T. G. et al., Cell 65:663-675 (1991)), and AP2 (Mitchell, P. J. et al., Genes and Development 5:105-119 (1991)).

The NF1 protein, like the mRNA, appears to be ubiquitously expressed and displays species conservation in mouse and rat. It is detectable by Western blotting using two different antisera in brain, lung, kidney, liver, spleen, muscle and colon. These findings are in complete agreement with Northern blot data demonstrating NF1 mRNA in mouse liver, kidney and brain. Buchberg, A. M. et al., Nature 347:291-294 (1990). The protein is very hydrophilic, based on computer analysis of its predicted amino acid sequence, and most likely resides in the cytoplasm. The protein also has a relatively slow turnover rate, in that metabolic labeling for 4-8 hours has failed to detect appreciable quantities of radiolabeled protein by immunoprecipitation, whereas labeling for 12-18 hours detects the protein.

The open reading frame of the NF1 cDNA predicts a protein of 2818 amino acids and a molecular weight of 327 kDa. Marchuk, D. A. et al., manuscript submitted. The size discrepancy observed may be the result of anomalous migration as seen in the cystic fibrosis transmembrane conductance regulator protein, which migrates as a 140 kDa protein despite the predicted 185 kDa size. Alternatively, the difference could reflect post-translational modifications, such as processing of a pro-protein species. This would necessarily involve cleavage of amino terminal sequences, as the D peptide and pMAL.B3A fusion protein antisera which recognize carboxy terminal residues identify the same 250 kDa protein as the more amino terminal H peptide antiserum. Although evidence exists for alternative splicing in the amino terminal portion of the NF1 mRNA, the demonstration that these alternative forms are actually translated into unique protein species awaits more directed experiments. The fact that the H and D peptide antisera recognize independently generated fusion proteins with overlapping epitopes and recognize the same protein by immunoprecipitation and Western blotting is strong evidence that the protein identified is the authentic NF1 gene product. Furthermore, using antisera directed against catalytic domain epitopes, it has been demonstrated that the NF1 protein identified by our antisera is identical to the protein product later identified by others. As previously discussed, the homology between a small portion of the NF1 gene product and the catalytic domain of a family of proteins with GTPase activity suggests that the NF1 gene product may also interact with ras. Tanaka, K. et al., Cell 60:803-807 (1990); Hall, A., Cell 61:921-923 (1990). Previous experiments have demonstrated that the NF1 catalytic domain (residues 1125-1537 of the full length cDNA predicted amino acid sequence) is able to replace yeast IRA1 and IRA2 in restoring the wild type phenotype in ira1⁻, ira2⁻ yeast strains as well as catalyze the conversion of wild-type, but not mutant, ras-GTP to ras-GDP. Ballester, R. M. et al., Cell 63:851-859 (1990); Xu, G. et al., Cell 63:835-841 (1990).
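The size figures discussed above can be restated with a short calculation. The following is a minimal sketch, assuming only numbers given in this specification (CDS 190..8646 from the sequence listing; predicted and observed sizes of 327 kDa and 250 kDa for the NF1 protein, and 185 kDa and 140 kDa for the cystic fibrosis transmembrane conductance regulator). The function names are illustrative, and the ratio is offered only as a convenient way to compare the two cases, not as evidence for either explanation of the discrepancy.

```python
def residues_encoded(cds_start: int, cds_end: int) -> int:
    """Number of amino acids encoded by a CDS whose last codon is a stop codon."""
    length = cds_end - cds_start + 1
    if length % 3 != 0:
        raise ValueError("CDS length is not a multiple of three")
    return length // 3 - 1  # subtract the stop codon


def apparent_to_predicted(apparent_kda: float, predicted_kda: float) -> float:
    """Ratio of the size observed on SDS-PAGE to the size predicted from sequence."""
    return apparent_kda / predicted_kda


if __name__ == "__main__":
    print("NF1 residues:", residues_encoded(190, 8646))
    print("NF1 apparent/predicted:", round(apparent_to_predicted(250, 327), 2))
    print("CFTR apparent/predicted:", round(apparent_to_predicted(140, 185), 2))
```

With these inputs the coding sequence accounts for the stated 2818 amino acids, and both proteins migrate at roughly three quarters of their predicted size, which is consistent with the comparison drawn above.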

It is proposed that future reference to the NF1 protein be to NF1-GAP-related protein (NF1 GRP) to underscore what is known about the protein and to avoid confusion with the NF1 transcriptional factor, the neurofilament proteins and the neurofibrillary tangles of Alzheimer's disease. It will also be appreciated that, within the scope of the invention claimed herein, the term gene product is meant to include both unmodified translated forms and any post-translationally modified forms (e.g., glycosylated, phosphorylated, cleaved, etc.) of the NF1 GRP protein.

APPLICATIONS

As previously discussed, NF1 is a disease of high frequency and high mutation rate. Screening for NF1, particularly in neonates and young children who are often asymptomatic, is thus one of the major applications of the present invention. Since the NF1 gene is ubiquitously expressed, test samples of the subject can be obtained from a variety of tissues or blood. An NF1 test can also be included in panels of prenatal tests since NF1 DNA, RNA or protein can also be assayed in amniotic fluid. It will be appreciated that, since NF1 is a dominant disorder, individuals who are heterozygous for NF1 may still express NF1 at 50% or reduced levels. Quantitative testing for NF1 transcript and gene product is thus also contemplated within the scope of the present invention.

Nucleic acid and protein-based methods for screening and diagnosing NF1 are all contemplated to be within the scope of the present invention. For example, knowing the sequence of the NF1 gene, DNA or RNA probes can be constructed and used to detect NF1 mutations through hybridization with genomic DNA in tissue or blood using conventional techniques. RNA or cDNA can be similarly probed to screen for NF1 mutations or for quantitative changes in expression. A mixture of different probes, i.e., a “probe cocktail”, can also be employed to test for more than one mutation.

With respect to nucleic acid-based testing, genomic DNA may be used directly for detection of specific sequence or may be amplified enzymatically in vitro by using PCR (Saiki, et al., Science 230:1350-1353 (1985); Saiki, et al., Nature 324:163-166 (1986)) prior to analysis. Recent reviews of this subject have been presented by Caskey, Science 236:1223-1228 (1989) and by Landegren, et al., Science 242:229-237 (1989). The detection of specific DNA sequence may be achieved by methods such as hybridization using specific oligonucleotides (Wallace, et al., Cold Spring Harbour Symp. Quant. Biol. 51:257-261 (1986)), direct DNA sequencing (Church, et al., PNAS (USA) 81:1991-1995 (1988)), the use of restriction enzymes (Flavell, et al., Cell 15:25 (1978); Geever, et al., PNAS (USA) 78:501 (1981)), discrimination on the basis of electrophoretic mobility in gels with denaturing reagent (Myers, et al., Cold Spring Harbour Symp. Quant. Biol. 51:275-284 (1986)), RNase protection (Myers, R. M. et al., Science 230:1242 (1985)), chemical cleavage (Cotton, et al., PNAS (USA) 85:4397-4401 (1985)), and the ligase-mediated detection procedure (Landegren, et al., Science 241:1077 (1988)).
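The logic of detection with specific oligonucleotides can be illustrated computationally. The following is a minimal sketch, assuming a simplified model in which a probe gives a signal only where it pairs perfectly with one strand of the target, as a stand-in for hybridization under stringent conditions. The reference is the first 30 nucleotides of the NF1 coding sequence (positions 190-219 of the nucleotide sequence listing); the single-base variant and the 15-nucleotide probes are hypothetical and chosen purely for illustration, not known NF1 mutations.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def probe_hits(target: str, probe: str) -> list[int]:
    """Return 0-based positions where the probe pairs perfectly with either strand.

    The target is given as one strand; a probe written in either orientation
    is located by searching for the probe and for its reverse complement.
    """
    target = target.upper()
    probe = probe.upper()
    hits = set()
    for query in {probe, reverse_complement(probe)}:
        start = target.find(query)
        while start != -1:
            hits.add(start)
            start = target.find(query, start + 1)
    return sorted(hits)


# First 30 nucleotides of the NF1 coding sequence (positions 190-219).
REFERENCE = "ATGGCCGCGCACAGGCCGGTGGAATGGGTC"
# Hypothetical variant: C replaced by T at 0-based index 15 (illustration only).
VARIANT = REFERENCE[:15] + "T" + REFERENCE[16:]

NORMAL_PROBE = REFERENCE[8:23]   # 15-mer spanning the site, normal allele
VARIANT_PROBE = VARIANT[8:23]    # 15-mer spanning the site, variant allele

if __name__ == "__main__":
    for label, target in (("reference", REFERENCE), ("variant", VARIANT)):
        print(label,
              "normal probe:", probe_hits(target, NORMAL_PROBE),
              "variant probe:", probe_hits(target, VARIANT_PROBE))
```

In this model each probe reports only the allele it was designed against. Real hybridization tolerates some mismatches depending on temperature and salt, so exact matching is a deliberate simplification.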

With respect to protein-based testing, antibodies can be generated to the NF1 gene product using standard immunological techniques, fusion proteins or synthetic peptides as described herein. Monoclonal antibodies can also be produced using now conventional techniques such as those described in Waldmann, T. A., Monoclonal Antibodies in Diagnosis and Therapy, Science 252:1657-1661 (1991) and Harlow, E. et al., Antibodies: A Laboratory Manual, Cold Spring Harbor, N.Y. (1988). It will also be appreciated that antibody fragments, i.e. Fab′ fragments, can be similarly employed. Immunoassays, for example ELISAs, in which the test sample is contacted with antibody and binding to the gene product detected, can provide a quick and efficient method of determining the presence and quantity of NF1 gene product.

With the characterization of the NF1 gene product and its function, functional assays can also be used for NF1 diagnosis and screening and to monitor treatment. For example, enzymatic testing to determine levels of gene function, rather than direct screening of the NF1 gene or product, can be employed. Testing of this nature has been utilized in other diseases and conditions, such as in Tay-Sachs. In the case of NF1, the NF1 protein can be assessed, for example, for its ability to catalyze the conversion of ras protein from its GTP to its GDP form. See Ballester, R. M. et al., Cell 63:851-859 (1990); Martin, G. A. et al., Cell 63:835-849 (1990).

Identification of the NF1 gene and its gene product also has therapeutic implications. In conventional replacement therapy, gene product or its functional equivalent is provided to the patient in therapeutically effective amounts. NF1 protein can be purified using conventional techniques such as those described in Deutscher, M. (editor), Guide to Protein Purification, Meth. in Enzymol. Vol. 182 (1990). Sufficient amounts of gene product or protein for treatment can be obtained, for example, through cultured cell systems or synthetic manufacture. Drug therapies which stimulate or replace the gene product can also be employed. Delivery vehicles and schemes can be specifically tailored to the particular protein or drug being administered.

Treatment can also take the form of modulation of the function of a defective protein or of modification of another protein or step in the pathway in which NF1 participates in order to correct the physiological abnormality. For example, since NF1 appears to act as a brake on the effects of activated ras protein, in the absence of NF1 (as would be expected to occur in a tumor in an NF1 patient) a useful approach would be to down-regulate ras. One method by which this could be accomplished would be through the use of an inhibitor of farnesyl transferase. See Gibbs, J. B., Cell 65:1-4 (1991).

Modulation of NF1 function can be accomplished by the use of therapeutic agents or drugs which can be designed to interact with different aspects of NF1 protein structure or function. For example, a drug or antibody can bind to a structural fold of the protein to correct a defective structure. Alternatively, a drug might bind to a specific functional residue and increase its affinity for a substrate or cofactor. Efficacy of a drug or agent can be identified by a screening program in which modulation is monitored in vitro in cell systems in which a defective NF1 protein is expressed. Alternatively, drugs can be designed to modulate NF1 activity from knowledge of the structure and function correlations of NF1 protein and from knowledge of the specific defect in the various NF1 mutant proteins. See Capsey, et al., Genetically Engineered Human Therapeutic Drugs, Stockton Press, New York (1988).

Gene therapy using recombinant technology to deliver the gene into the patient's cells or vectors which will supply the patient with gene product in vivo is also contemplated as within the scope of the present invention. Retroviruses have been considered a preferred vector for experiments in somatic gene therapy, with a high efficiency of infection and stable integration and expression (Orkin, et al., Prog. Med. Genet. 7:130 (1988)). For example, NF1 gene cDNA can be cloned into a retroviral vector and driven from either its endogenous promoter or from the retroviral LTR (long terminal repeat). Other delivery systems which can be utilized include adeno-associated virus (AAV) (McLaughlin, et al., J. Virol. 62:1963 (1988)), vaccinia virus (Moss, et al., Annu. Rev. Immunol. 5:305 (1987)), bovine papilloma virus (Rasmussen, et al., Meth. Enzymol. 139:642 (1987)), or a member of the herpesvirus group such as Epstein-Barr virus (Margolskee, et al., Mol. Cell. Biol. 8:2937 (1988)). Finally, since a defect in the NF1 gene results in the unbridled proliferation of nervous tissue, identification of the gene and gene product may be useful in developing treatments for non-NF1 tumors of the nervous system. Since the NF1 gene product appears to function as a tumor suppressor, increasing the supply of the product may have a beneficial effect on such tumors. Conversely, where increased proliferation of nervous tissue would be advantageous, strategies to decrease production of the NF1 product might prove beneficial.

The foregoing discussion discloses and describes merely exemplary embodiments of the present invention. One skilled in the art will readily recognize from such discussion, and from the accompanying drawings and claims, that various changes, modifications and variations can be made therein without departing from the spirit and scope of the invention as defined in the following claims.

14 8937 base pairs nucleic acid single linear cDNA to mRNA NO NO HomoSapiens 17q11.2 misc_feature 1..8937 /note= “Entire length of sequenceoverlaps clones P5 and B3A” CDS 190..8646 misc_feature 8425..8646 /note=“219 nt Pstl-HindIII fragment designated pMAL.B3A” misc_feature382..1302 /note= “918 nt HpaI-PstI fragment designated pMAL.HF3A.P”misc_feature 382..3909 /note= “3523 nt Hpal-Xhol fragment designatedpMAL.HF3A.X” M.R. et al.Wallace Type 1 Neurofibromatosis Gene CorrectionScience 250 12/21/90 1749- 12/21-1990 1 FROM 1 TO 8937 M.R. etal.Wallace Type 1 Neurofibromatosis Gene Identification of a LargeTranscript in Three NF1 Patients Science 249 07/13/90 181-186 07/13-19901 FROM 1 TO 8937 1 CCCTTTCCCT CTCCCCCTCC CGCTCGGCGC TGACCCCCCATCCCCACCCC CGTGGGAACA 60 CTGGGAGCCT GCACTCCACA GACCCTCTCC TTGCCTCTTCCCTCACCTCA GCCTCCGCTC 120 CCCGCCCTCT TCCCGGCCCA GGGCGCCGGC CCACCCTTCCCTCCGCCGCC CCCCGGCCGC 180 GGGGAGGACA TGGCCGCGCA CAGGCCGGTG GAATGGGTCCAGGCCGTGGT CAGCCGCTTC 240 GACGAGCAGC TTCCAATAAA AACAGGACAG CAGAACACACATACCAAAGT CAGTACTGAG 300 CACAACAAGG AATGTCTAAT CAATATTTCC AAATACAAGTTTTCTTTGGT TATAAGCGGC 360 CTCACTACTA TTTTAAAGAA TGTTAACAAT ATGAGAATATTTGGAGAAGC TGCTGAAAAA 420 AATTTATATC TCTCTCAGTT GATTATATTG GATACACTGGAAAAATGTCT TGCTGGGCAA 480 CCAAAGGACA CAATGAGATT AGATGAAACG ATGCTGGTCAAACAGTTGCT GCCAGAAATC 540 TGCCATTTTC TTCACACCTG TCGTGAAGGA AACCAGCATGCAGCTGAACT TCGGAATTCT 600 GCCTCTGGGG TTTTATTTTC TCTCAGCTGC AACAACTTCAATGCAGTCTT TAGTCGCATT 660 TCTACCAGGT TACAGGAATT AACTGTTTGT TCAGAAGACAATGTTGATGT TCATGATATA 720 GAATTGTTAC AGTATATCAA TGTGGATTGT GCAAAATTAAAACGACTCCT GAAGGAAACA 780 GCATTTAAAT TTAAAGCCCT AAAGAAGGTT GCGCAGTTAGCAGTTATAAA TAGCCTGGAA 840 AAGGCATTTT GGAACTGGGT AGAAAATTAT CCAGATGAATTTACAAAACT GTACCAGATC 900 CCACAGACTG ATATGGCTGA ATGTGCAGAA AAGCTATTTGACTTGGTGGA TGGTTTTGCT 960 GAAAGCACCA AACGTAAAGC AGCAGTTTGG CCACTACAAATCATTCTCCT TATCTTGTGT 1020 CCAGAAATAA TCCAGGATAT ATCCAAAGAC GTGGTTGATGAAAACAACAT GAATAAGAAG 1080 TTATTTCTGG ACAGTCTACG AAAAGCTCTT GCTGGCCATGGAGGAAGTAG 
GCAGCTGACA 1140 GAAAGTGCTG CAATTGCCTG TGTCAAACTG TGTAAAGCAAGTACTTACAT CAATTGGGAA 1200 GATAACTCTG TCATTTTCCT ACTTGTTCAG TCCATGGTGGTTGATCTTAA GAACCTGCTT 1260 TTTAATCCAA GTAAGCCATT CTCAAGAGGC AGTCAGCCTGCAGATGTGGA TCTAATGATT 1320 GACTGCCTTG TTTCTTGCTT TCGTATAAGC CCTCACAACAACCAACACTT TAAGATCTGC 1380 CTGGCTCAGA ATTCACCTTC TACATTTCAC TATGTGCTGGTAAATTCACT CCATCGAATC 1440 ATCACCAATT CCGCATTGGA TTGGTGGCCT AAGATTGATGCTGTGTATTG TCACTCGGTT 1500 GAACTTCGAA ATATGTTTGG TGAAACACTT CATAAAGCAGTGCAAGGTTG TGGAGCACAC 1560 CCAGCAATAC GAATGGCCCC GAGTCTTACA TTTAAAGAAAAAGTAACAAG CCTTAAATTT 1620 AAAGAAAAAC CTACAGACCT GGAGACAAGA AGCTATAAGTATCTTCTCTT GTCCATAGTG 1680 AAACTAATTC ATGCAGATCC AAAGCTCTTG CTTTGTAATCCAAGAAAACA GGGGCCCGAA 1740 ACCCAAGGCA GTACAGCAGA ATTAATTACA GGGCTCGTCCAACTGGTCCC TCAGTCACAC 1800 ATGCCAGAGA TTGCTCAGGA AGCAATGGAG GCTCTGCTGGTTCTTCATCA GTTAGATAGC 1860 ATTGATTTGT GGAATCCTGA TGCTCCTGTA GAAACATTTTGGGAGATTAG CTCACAAATG 1920 CTTTTTTACA TCTGCAAGAA ATTAACTAGT CATCAAATGCTTAGTAGCAC AGAAATTCTC 1980 AAGTGGTTGC GGGAAATATT GATCTGCAGG AATAAATTTCTTCTTAAAAA TAAGCAGGCA 2040 GATAGAAGTT CCTGTCACTT TCTCCTTTTT TACGGGGTAGGATGTGATAT TCCTTCTAGT 2100 GGAAATACCA GTCAAATGTC CATGGATCAT GAAGAATTACTACGTACTCC TGGAGCCTCT 2160 CTCCGGAAGG GAAAAGGGAA CTCCTCTATG GATAGTGCAGCAGGATGCAG CGGAACCCCC 2220 CCAATTTGCC GACAAGCCCA GACCAAACTA GAAGTGGCCCTGTACATGTT TCTGTGGAAC 2280 CCTGACACTG AAGCTGTTCT GGTTGCCATG TCCTGTTTCCGCCACCTCTG TGAGGAAGCA 2340 GATATCCGGT GTGCGGTGGA TGAAGTGTCA GTGCATAACCTCTTGCCCAA CTATAACACA 2400 TTCATGGAGT TTGCCTCTGT CAGCAATATG ATGTCAACAGGAAGAGCAGC ACTTCAGAAA 2460 AGAGTGATGG CACTGCTGAG GCGCATTGAG CATCCCACTGCAGGAAACAC TGAGGCTTGG 2520 GAAGATACAC ATGCAAAATG GGAACAAGCA ACAAAGCTAATCCTTAACTA TCCAAAAGCC 2580 AAAATGGAAG ATGGCCAGGC TGCTGAAAGC CTTCACAAGACCATTGTTAA GAGGCGAATG 2640 TCCCATGTGA GTGGAGGAGG ATCCATAGAT TTGTCTGACACAGACTCCCT ACAGGAATGG 2700 ATCAACATGA CTGGCTTCCT TTGTGCCCTT GGAGGAGTGTGCCTCCAGCA GAGAAGCAAT 2760 TCTGGCCTGG CAACCTATAG CCCACCCATG GGTCCAGTCAGTGAACGTAA GGGTTCTATG 2820 ATTTCAGTGA TGTCTTCAGA 
GGGAAACGCA GATACACCTGTCAGCAAATT TATGGATCGG 2880 CTGTTGTCCT TAATGGTGTG TAACCATGAG AAAGTGGGACTTCAAATACG GACCAATGTT 2940 AAGGATCTGG TGGGTCTAGA ATTGAGTCCT GCTCTGTATCCAATGCTATT TAACAAATTG 3000 AAGAATACCA TCAGCAAGTT TTTTGACTCC CAAGGACAGGTTTTATTGAC TGATACCAAT 3060 ACTCAATTTG TAGAACAAAC CATAGCTATA ATGAATAACTTGCTAGATAA TCATACTGAA 3120 GGCAGCTCTG AACATCTAGG GCAAGCTAGC ATTGAAACAATGATGTTAAA TCTGGTCAGG 3180 TATGTTCGTG TGCTTGGGAA TATGGTCCAT GCAATTCAAATAAAAACGAA ACTGTGTCAA 3240 TTAGTTGAAG TAATGATGGC AAGGAGAGAT GACCTCTCATTTTGCCAAGA GATGAAATTT 3300 AGGAATAAGA TGGTAGAATA CCTGACAGAC TGGGTTATGGGAACATCAAA CCAAGCAGCA 3360 GATGATGATG TAAAATGTCT TACAAGAGAT TTGGACCAGGCAAGCATGGA AGCAGTAGTT 3420 TCACTTCTAG CTGGTCTCCC TCTGCAGCCT GAAGAAGGAGATGGTGTGGA ATTGATGGAA 3480 GCCAAATCAC AGTTATTTCT TAAATACTTC ACATTATTTATGAACCTTTT GAATGACTGC 3540 AGTGAAGTTG AAGATGAAAG TGCGCAAACA GGTGGCAGGAAACGTGGCAT GTCTCGGAGG 3600 CTGGCATCAC TGAGGCACTG TACGGTCCTT GCAATGTCAAACTTACTCAA TGCCAACGTA 3660 GACAGTGGTC TCATGCACTC CATAGGCTTA GGTTACCACAAGGATCTCCA GACAAGAGCT 3720 ACATTTATGG AAGTTCTGAC AAAAATCCTT CAACAAGGCACAGAATTTGA CACACTTGCA 3780 GAAACAGTAT TGGCTGATCG GTTTGAGAGA TTGGTGGAACTGGTCACAAT GATGGGTGAT 3840 CAAGGAGAAC TCCCTATAGC GATGGCTCTG GCCAATGTGGTTCCTTGTTC TCAGTGGGAT 3900 GAACTAGCTC GAGTTCTGGT TACTCTGTTT GATTCTCGGCATTTACTCTA CCAACTGCTC 3960 TGGAACATGT TTTCTAAAGA AGTAGAATTG GCAGACTCCATGCAGACTCT CTTCCGAGGC 4020 AACAGCTTGG CCAGTAAAAT AATGACATTC TGTTTCAAGGTATATGGTGC TACCTATCTA 4080 CAAAAACTCC TGGATCCTTT ATTACGAATT GTGATCACATCCTCTGATTG GCAACATGTT 4140 AGCTTTGAAG TGGATCCTAC CAGGTTAGAA CCATCAGAGAGCCTTGAGGA AAACCAGCGG 4200 AACCTCCTTC AGATGACTGA AAAGTTCTTC CATGCCATCATCAGTTCCTC CTCAGAATTC 4260 CCCCCTCAAC TTCGAAGTGT GTGCCACTGT TTATACCAGGTGGTTAGCCA GCGTTTCCCT 4320 CAGAACAGCA TCGGTGCAGT AGGAAGTGCC ATGTTCCTCAGATTTATCAA TCCTGCCATT 4380 GTCTCACCGT ATGAAGCAGG GATTTTAGAT AAAAAGCCACCACCTAGAAT CGAAAGGGGC 4440 TTGAAGTTAA TGTCAAAGAT ACTTCAGAGT ATTGCCAATCATGTTCTCTT CACAAAAGAA 4500 GAACATATGC GGCCTTTCAA TGATTTTGTG AAAAGCAACTTTGATGCAGC 
ACGCAGGTTT 4560 TTCCTTGATA TAGCATCTGA TTGTCCTACA AGTGATGCAGTAAATCATAG TCTTTCCTTC 4620 ATAAGTGACG GCAATGTGCT TGCTTTACAT CGTCTACTCTGGAACAATCA GGAGAAAATT 4680 GGGCAGTATC TTTCCAGCAA CAGGGATCAT AAAGCTGTTGGAAGACGACC TTTTGATAAG 4740 ATGGCAACAC TTCTTGCATA CCTGGGTCCT CCAGAGCACAAACCTGTGGC AGATACACAC 4800 TGGTCCAGCC TTAACCTTAC CAGTTCAAAG TTTGAGGAATTTATGACTAG GCATCAGGTA 4860 CATGAAAAAG AAGAATTCAA GGCTTTGAAA ACGTTAAGTATTTTCTACCA AGCTGGGACT 4920 TCCAAAGCTG GGAATCCTAT TTTTTATTAT GTTGCACGGAGGTTCAAAAC TGGTCAAATC 4980 AATGGTGATT TGCTGATATA CCATGTCTTA CTGACTTTAAAGCCATATTA TGCAAAGCCA 5040 TATGAAATTG TAGTGGACCT TACCCATACC GGGCCTAGCAATCGCTTTAA AACAGACTTT 5100 CTCTCTAAGT GGTTTGTTGT TTTTCCTGGC TTTGCTTACGACAACGTCTC CGCAGTCTAT 5160 ATCTATAACT GTAACTCCTG GGTCAGGGAG TACACCAAGTATCATGAGCG GCTGCTGACT 5220 GGCCTCAAAG GTAGCAAAAG GCTTGTTTTC ATAGACTGTCCTGGGAAACT GGCTGAGCAC 5280 ATAGAGCATG AACAACAGAA ACTACCTGCT GCCACCTTGGCTTTAGAAGA GGACCTGAAG 5340 GTATTCCACA ATGCTCTCAA GCTAGCTCAC AAAGACACCAAAGTTTCTAT TAAAGTTGGT 5400 TCTACTGCTG TCCAAGTAAC TTCAGCAGAG CGAACAAAAGTCCTAGGGCA ATCAGTCTTT 5460 CTAAATGACA TTTATTATGC TTCGGAAATT GAAGAAATCTGCCTAGTAGA TGAGAACCAG 5520 TTCACCTTAA CCATTGCAAA CCAGGGCACG CCGCTCACCTTCATGCACCA GGAGTGTGAA 5580 GCCATTGTCC AGTCTATCAT TCATATCCGG ACCCGCTGGGAACTGTCACA GCCCGACTCT 5640 ATCCCCCAAC ACACCAAGAT TCGGCCAAAA GATGTCCCTGGGACACTGCT CAATATCGCA 5700 TTACTTAATT TAGGCAGTTC TGACCCGAGT TTACGGTCAGCTGCCTATAA TCTTCTGTGT 5760 GCCTTAACTT GTACCTTTAA TTTAAAAATC GAGGGCCAGTTACTAGAGAC ATCAGGTTTA 5820 TGTATCCCTG CCAACAACAC CCTCTTTATT GTCTCTATTAGTAAGACACT GGCAGCCAAT 5880 GAGCCACACC TCACGTTAGA ATTTTTGGAA GAGTGTATTTCTGGATTTAG CAAATCTAGT 5940 ATTGAATTGA AACACCTTTG TTTGGAATAC ATGACTCCATGGCTGTCAAA TCTAGTTCGT 6000 TTTTGCAAGC ATAATGATGA TGCCAAACGA CAAAGAGTTACTGCTATTCT TGACAAGCTG 6060 ATAACAATGA CCATCAATGA AAAACAGATG TACCCATCTATTCAAGCAAA AATATGGGGA 6120 AGCCTTGGGC AGATTACAGA TCTGCTTGAT GTTGTACTAGACAGTTTCAT CAAAACCAGT 6180 GCAACAGGTG GCTTGGGATC AATAAAAGCT GAGGTGATGGCAGATACTGC TGTAGCTTTG 6240 GCTTCTGGAA ATGTGAAATT 
GGTTTCAAGC AAGGTTATTGGAAGGATGTG CAAAATAATT 6300 GACAAGACAT GCTTATCTCC AACTCCTACT TTAGAACAACATCTTATGTG GGATGATATT 6360 GCTATTTTAG CACGCTACAT GCTGATGCTG TCCTTCAACAATTCCCTTGA TGTGGCAGCT 6420 CATCTTCCCT ACCTCTTCCA CGTTGTTACT TTCTTAGTAGCCACAGGTCC GCTCTCCCTT 6480 AGAGCTTCCA CACATGGACT GGTCATTAAT ATCATTCACTCTCTGTGTAC TTGTTCACAG 6540 CTTCATTTTA GTGAAGAGAC CAAGCAAGTT TTGAGACTCAGTCTGACAGA GTTCTCATTA 6600 CCCAAATTTT ACTTGCTGTT TGGCATTAGC AAAGTCAAGTCAGCTGCTGT CATTGCCTTC 6660 CGTTCCAGTT ACCGGGACAG GTCATTCTCT CCTGGCTCCTATGAGAGAGA GACTTTTGCT 6720 TTGACATCCT TGGAAACAGT CACAGAAGCT TTGTTGGAGATCATGGAGGC ATGCATGAGA 6780 GATATTCCAA CGTGCAAGTG GCTGGACCAG TGGACAGAACTAGCTCAAAG ATTTGCATTC 6840 CAATATAATC CATCCCTGCA ACCAAGAGCT CTTGTTGTCTTTGGGTGTAT TAGCAAACGA 6900 GTGTCTCATG GGCAGATAAA GCAGATAATC CGTATTCTTAGCAAGGCACT TGAGAGTTGC 6960 TTAAAAGGAC CTGACACTTA CAACAGTCAA GTTCTGATAGAAGCTACAGT AATAGCACTA 7020 ACCAAATTAC AGCCACTTCT TAATAAGGAC TCGCCTCTGCACAAAGCCCT CTTTTGGGTA 7080 GCTGTGGCTG TGCTGCAGCT TGATGAGGTC AACTTGTATTCAGCAGGTAC CGCACTTCTT 7140 GAACAAAACC TGCATACTTT AGATAGTCTC CGTATATTCAATGACAAGAG TCCAGAGGAA 7200 GTATTTATGG CAATCCGGAA TCCTCTGGAG TGGCACTGCAAGCAAATGGA TCATTTTGTT 7260 GGACTCAATT TCAACTCTAA CTTTAACTTT GCATTGGTTGGACACCTTTT AAAAGGGTAC 7320 AGGCATCCTT CACCTGCTAT TGTTGCAAGA ACAGTCAGAATTTTACATAC ACTACTAACT 7380 CTGGTTAACA AACACAGAAA TTGTGACAAA TTTGAAGTGAATACACAGAG CGTGGCCTAC 7440 TTAGCAGCTT TACTTACAGT GTCTGAAGAA GTTCGAAGTCGCTGCAGCCT AAAACATAGA 7500 AAGTCACTTC TTCTTACTGA TATTTCAATG GAAAATGTTCCTATGGATAC ATATCCCATT 7560 CATCATGGTG ACCCTTCCTA TAGGACACTA AAGGAGACTCAGCCATGGTC CTCTCCCAAA 7620 GGTTCTGAAG GATACCTTGC AGCCACCTAT CCAACTGTCGGCCAGACCAG TCCCCGAGCC 7680 AGGAAATCCA TGAGCCTGGA CATGGGGCAA CCTTCTCAGGCCAACACTAA GAAGTTGCTT 7740 GGAACAAGGA AAAGTTTTGA TCACTTGATA TCAGACACAAAGGCTCCTAA AAGGCAAGAA 7800 ATGGAATCAG GGATCACAAC ACCCCCCAAA ATGAGGAGAGTAGCAGAAAC TGATTATGAA 7860 ATGGAAACTC AGAGGATTTC CTCATCACAA CAGCACCCACATTTACGTAA AGTTTCAGTG 7920 TCTGAATCAA ATGTTCTCTT GGATGAAGAA GTACTTACTGATCCGAAGAT 
CCAGGCGCTG 7980 CTTCTTACTG TTCTAGCTAC ACTGGTAAAA TATACCACAGATGAGTTTGA TCAACGAATT 8040 CTTTATGAAT ACTTAGCAGA GGCCAGTGTT GTGTTTCCCAAAGTCTTTCC TGTTGTGCAT 8100 AATTTGTTGG ACTCTAAGAT CAACACCCTG TTATCATTGTGCCAAGATCC AAATTTGTTA 8160 AATCCAATCC ATGGAATTGT GCAGAGTGTG GTGTACCATGAAGAATCCCC ACCACAATAC 8220 CAAACATCTT ACCTGCAAAG TTTTGGTTTT AATGGCTTGTGGCGGTTTGC AGGACCGTTT 8280 TCAAAGCAAA CACAAATTCC AGACTATGCT GAGCTTATTGTTAAGTTTCT TGATGCCTTG 8340 ATTGACACGT ACCTGCCTGG AATTGATGAA GAAACCAGTGAAGAATCCCT CCTGACTCCC 8400 ACATCTCCTT ACCCTCCTGC ACTGCAGAGC CAGCTTAGTATCACTGCCAA CCTTAACCTT 8460 TCTAATTCCA TGACCTCACT TGCAACTTCC CAGCATTCCCCAGGAATCGA CAAGGAGAAC 8520 GTTGAACTCT CCCCTACCAC TGGCCACTGT AACAGTGGACGAACTCGCCA CGGATCCGCA 8580 AGCCAAGTGC AGAAGCAAAG AAGCGCTGGC AGTTTCAAACGTAATAGCAT TAAGAAGATC 8640 GTGTGAAGCT TGCTTGCTTT CTTTTTTAAA ATCAACTTAACATGGGCTCT TCACTAGTGA 8700 CCCCTTCCCT GTCCTTGCCC TTTCCCCCCA TGTTGTAATGCTGCACTTCC TGTTTTATAA 8760 TGAACCCATC CGGTTTGCCA TGTTGCCAGA TGATCAACTCTTCGAAGCCT TGCCTAAATT 8820 TAATGCTGCC TTTTCTTTAA CTTTTTTTCT TCTACTTTTGGCGTGTATCT GGTATATGTA 8880 AGTGTTCAGA ACAACTGCAA AGAAAGTGGG AGGTCAGGAAACTTTTAACT GAGAAAT 8937 2818 amino acids amino acid single linear cDNAto mRNA NO NO Homo sapiens 17q11.2 Cleavage-site group(583..586,815..818, 2573..2576, 2810..2813) /note= “Potential cAMP-dependentprotein kinase recognition sites” Modified-site 2549..2556 /note=“Potential tyrosine phosphorylation site” Modified-site group(1264,1276, 1358, 1377, 1389, 1390, 1391, 1395, 1396, 1400, 1423, 1426, 1429,1430) /note= “Invariant residues within most statistically significantregions of similarity among the GAP family of proteins” Modified-sitegroup(1264..1290, 1345..1407, 1415..1430) /note= “Most statisticallysignificant regions of similarity among the GAP family of proteins”Modified-site 496 /note= “At variance with previously published sequencewhich shows an ATG methionine codon rather than an ATA isoleucine codon”Modified-site 1183 /note= “At variance with previously 
publishedsequence. Shows an CTG leucine codon rather than the previouslypublished CTC” Modified-site 1555 /note= “At variance with previouslypublished sequence. Lacks an extra CAT histidine condon after thisresidue” Modified-site (2771{circumflex over ( #2772)} ) /note=“Position of an 18 amino acid insertion(SEQ ID NO10) representing analternatively spliced product” Modified-site (1370{circumflex over ()}1371) /note= “Position of a 21 amino acid insertion representing analternatively spliced product” Domain 1125..1537 /note= “NF1 catalyticdomain” Modified-site 2746..2818 /note= “Corresponding amino acids forthe PstI-HindIII fragment designated pMAL.B3A” Modified-site 65..371/note= “Corresponding amino acids for the HpaI-PstI fragment designatedpMAL.HF3A.P” Modified-site 65..1240 /note= “Corresponding amino acidsfor the HpaI-XhoI fragment designated pMAL.HF3A.X” M.R. et al.WallaceType 1 Neurofibromatosis Gene Correction Science 250 12/21/90 1749-12/21-1990 2 FROM 1 TO 2818 M.R. et al.Wallace Type 1 NeurofibromatosisGene Identification of a Large Transcript in Three NF1 Patients Science249 07/13/90 181-186 07/13-1990 2 FROM 1 TO 2818 2 Met Ala Ala His ArgPro Val Glu Trp Val Gln Ala Val Val Ser Arg 1 5 10 15 Phe Asp Glu GlnLeu Pro Ile Lys Thr Gly Gln Gln Asn Thr His Thr 20 25 30 Lys Val Ser ThrGlu His Asn Lys Glu Cys Leu Ile Asn Ile Ser Lys 35 40 45 Tyr Lys Phe SerLeu Val Ile Ser Gly Leu Thr Thr Ile Leu Lys Asn 50 55 60 Val Asn Asn MetArg Ile Phe Gly Glu Ala Ala Glu Lys Asn Leu Tyr 65 70 75 80 Leu Ser GlnLeu Ile Ile Leu Asp Thr Leu Glu Lys Cys Leu Ala Gly 85 90 95 Gln Pro LysAsp Thr Met Arg Leu Asp Glu Thr Met Leu Val Lys Gln 100 105 110 Leu LeuPro Glu Ile Cys His Phe Leu His Thr Cys Arg Glu Gly Asn 115 120 125 GlnHis Ala Ala Glu Leu Arg Asn Ser Ala Ser Gly Val Leu Phe Ser 130 135 140Leu Ser Cys Asn Asn Phe Asn Ala Val Phe Ser Arg Ile Ser Thr Arg 145 150155 160 Leu Gln Glu Leu Thr Val Cys Ser Glu Asp Asn Val Asp Val His Asp165 170 175 Ile Glu Leu Leu Gln Tyr Ile Asn 
Val Asp Cys Ala Lys Leu LysArg 180 185 190 Leu Leu Lys Glu Thr Ala Phe Lys Phe Lys Ala Leu Lys LysVal Ala 195 200 205 Gln Leu Ala Val Ile Asn Ser Leu Glu Lys Ala Phe TrpAsn Trp Val 210 215 220 Glu Asn Tyr Pro Asp Glu Phe Thr Lys Leu Tyr GlnIle Pro Gln Thr 225 230 235 240 Asp Met Ala Glu Cys Ala Glu Lys Leu PheAsp Leu Val Asp Gly Phe 245 250 255 Ala Glu Ser Thr Lys Arg Lys Ala AlaVal Trp Pro Leu Gln Ile Ile 260 265 270 Leu Leu Ile Leu Cys Pro Glu IleIle Gln Asp Ile Ser Lys Asp Val 275 280 285 Val Asp Glu Asn Asn Met AsnLys Lys Leu Phe Leu Asp Ser Leu Arg 290 295 300 Lys Ala Leu Ala Gly HisGly Gly Ser Arg Gln Leu Thr Glu Ser Ala 305 310 315 320 Ala Ile Ala CysVal Lys Leu Cys Lys Ala Ser Thr Tyr Ile Asn Trp 325 330 335 Glu Asp AsnSer Val Ile Phe Leu Leu Val Gln Ser Met Val Val Asp 340 345 350 Leu LysAsn Leu Leu Phe Asn Pro Ser Lys Pro Phe Ser Arg Gly Ser 355 360 365 GlnPro Ala Asp Val Asp Leu Met Ile Asp Cys Leu Val Ser Cys Phe 370 375 380Arg Ile Ser Pro His Asn Asn Gln His Phe Lys Ile Cys Leu Ala Gln 385 390395 400 Asn Ser Pro Ser Thr Phe His Tyr Val Leu Val Asn Ser Leu His Arg405 410 415 Ile Ile Thr Asn Ser Ala Leu Asp Trp Trp Pro Lys Ile Asp AlaVal 420 425 430 Tyr Cys His Ser Val Glu Leu Arg Asn Met Phe Gly Glu ThrLeu His 435 440 445 Lys Ala Val Gln Gly Cys Gly Ala His Pro Ala Ile ArgMet Ala Pro 450 455 460 Ser Leu Thr Phe Lys Glu Lys Val Thr Ser Leu LysPhe Lys Glu Lys 465 470 475 480 Pro Thr Asp Leu Glu Thr Arg Ser Tyr LysTyr Leu Leu Leu Ser Ile 485 490 495 Val Lys Leu Ile His Ala Asp Pro LysLeu Leu Leu Cys Asn Pro Arg 500 505 510 Lys Gln Gly Pro Glu Thr Gln GlySer Thr Ala Glu Leu Ile Thr Gly 515 520 525 Leu Val Gln Leu Val Pro GlnSer His Met Pro Glu Ile Ala Gln Glu 530 535 540 Ala Met Glu Ala Leu LeuVal Leu His Gln Leu Asp Ser Ile Asp Leu 545 550 555 560 Trp Asn Pro AspAla Pro Val Glu Thr Phe Trp Glu Ile Ser Ser Gln 565 570 575 Met Leu PheTyr Ile Cys Lys Lys Leu Thr Ser His Gln Met Leu Ser 580 585 590 Ser ThrGlu Ile Leu Lys Trp Leu Arg Glu Ile Leu Ile Cys Arg Asn 
595 600 605 LysPhe Leu Leu Lys Asn Lys Gln Ala Asp Arg Ser Ser Cys His Phe 610 615 620Leu Leu Phe Tyr Gly Val Gly Cys Asp Ile Pro Ser Ser Gly Asn Thr 625 630635 640 Ser Gln Met Ser Met Asp His Glu Glu Leu Leu Arg Thr Pro Gly Ala645 650 655 Ser Leu Arg Lys Gly Lys Gly Asn Ser Ser Met Asp Ser Ala AlaGly 660 665 670 Cys Ser Gly Thr Pro Pro Ile Cys Arg Gln Ala Gln Thr LysLeu Glu 675 680 685 Val Ala Leu Tyr Met Phe Leu Trp Asn Pro Asp Thr GluAla Val Leu 690 695 700 Val Ala Met Ser Cys Phe Arg His Leu Cys Glu GluAla Asp Ile Arg 705 710 715 720 Cys Ala Val Asp Glu Val Ser Val His AsnLeu Leu Pro Asn Tyr Asn 725 730 735 Thr Phe Met Glu Phe Ala Ser Val SerAsn Met Met Ser Thr Gly Arg 740 745 750 Ala Ala Leu Gln Lys Arg Val MetAla Leu Leu Arg Arg Ile Glu His 755 760 765 Pro Thr Ala Gly Asn Thr GluAla Trp Glu Asp Thr His Ala Lys Trp 770 775 780 Glu Gln Ala Thr Lys LeuIle Leu Asn Tyr Pro Lys Ala Lys Met Glu 785 790 795 800 Asp Gly Gln AlaAla Glu Ser Leu His Lys Thr Ile Val Lys Arg Arg 805 810 815 Met Ser HisVal Ser Gly Gly Gly Ser Ile Asp Leu Ser Asp Thr Asp 820 825 830 Ser LeuGln Glu Trp Ile Asn Met Thr Gly Phe Leu Cys Ala Leu Gly 835 840 845 GlyVal Cys Leu Gln Gln Arg Ser Asn Ser Gly Leu Ala Thr Tyr Ser 850 855 860Pro Pro Met Gly Pro Val Ser Glu Arg Lys Gly Ser Met Ile Ser Val 865 870875 880 Met Ser Ser Glu Gly Asn Ala Asp Thr Pro Val Ser Lys Phe Met Asp885 890 895 Arg Leu Leu Ser Leu Met Val Cys Asn His Glu Lys Val Gly LeuGln 900 905 910 Ile Arg Thr Asn Val Lys Asp Leu Val Gly Leu Glu Leu SerPro Ala 915 920 925 Leu Tyr Pro Met Leu Phe Asn Lys Leu Lys Asn Thr IleSer Lys Phe 930 935 940 Phe Asp Ser Gln Gly Gln Val Leu Leu Thr Asp ThrAsn Thr Gln Phe 945 950 955 960 Val Glu Gln Thr Ile Ala Ile Met Asn AsnLeu Leu Asp Asn His Thr 965 970 975 Glu Gly Ser Ser Glu His Leu Gly GlnAla Ser Ile Glu Thr Met Met 980 985 990 Leu Asn Leu Val Arg Tyr Val ArgVal Leu Gly Asn Met Val His Ala 995 1000 1005 Ile Gln Ile Lys Thr LysLeu Cys Gln Leu Val Glu Val Met Met Ala 1010 1015 1020 Arg Arg Asp 
AspLeu Ser Phe Cys Gln Glu Met Lys Phe Arg Asn Lys 1025 1030 1035 1040 MetVal Glu Tyr Leu Thr Asp Trp Val Met Gly Thr Ser Asn Gln Ala 1045 10501055 Ala Asp Asp Asp Val Lys Cys Leu Thr Arg Asp Leu Asp Gln Ala Ser1060 1065 1070 Met Glu Ala Val Val Ser Leu Leu Ala Gly Leu Pro Leu GlnPro Glu 1075 1080 1085 Glu Gly Asp Gly Val Glu Leu Met Glu Ala Lys SerGln Leu Phe Leu 1090 1095 1100 Lys Tyr Phe Thr Leu Phe Met Asn Leu LeuAsn Asp Cys Ser Glu Val 1105 1110 1115 1120 Glu Asp Glu Ser Ala Gln ThrGly Gly Arg Lys Arg Gly Met Ser Arg 1125 1130 1135 Arg Leu Ala Ser LeuArg His Cys Thr Val Leu Ala Met Ser Asn Leu 1140 1145 1150 Leu Asn AlaAsn Val Asp Ser Gly Leu Met His Ser Ile Gly Leu Gly 1155 1160 1165 TyrHis Lys Asp Leu Gln Thr Arg Ala Thr Phe Met Glu Val Leu Thr 1170 11751180 Lys Ile Leu Gln Gln Gly Thr Glu Phe Asp Thr Leu Ala Glu Thr Val1185 1190 1195 1200 Leu Ala Asp Arg Phe Glu Arg Leu Val Glu Leu Val ThrMet Met Gly 1205 1210 1215 Asp Gln Gly Glu Leu Pro Ile Ala Met Ala LeuAla Asn Val Val Pro 1220 1225 1230 Cys Ser Gln Trp Asp Glu Leu Ala ArgVal Leu Val Thr Leu Phe Asp 1235 1240 1245 Ser Arg His Leu Leu Tyr GlnLeu Leu Trp Asn Met Phe Ser Lys Glu 1250 1255 1260 Val Glu Leu Ala AspSer Met Gln Thr Leu Phe Arg Gly Asn Ser Leu 1265 1270 1275 1280 Ala SerLys Ile Met Thr Phe Cys Phe Lys Val Tyr Gly Ala Thr Tyr 1285 1290 1295Leu Gln Lys Leu Leu Asp Pro Leu Leu Arg Ile Val Ile Thr Ser Ser 13001305 1310 Asp Trp Gln His Val Ser Phe Glu Val Asp Pro Thr Arg Leu GluPro 1315 1320 1325 Ser Glu Ser Leu Glu Glu Asn Gln Arg Asn Leu Leu GlnMet Thr Glu 1330 1335 1340 Lys Phe Phe His Ala Ile Ile Ser Ser Ser SerGlu Phe Pro Pro Gln 1345 1350 1355 1360 Leu Arg Ser Val Cys His Cys LeuTyr Gln Val Val Ser Gln Arg Phe 1365 1370 1375 Pro Gln Asn Ser Ile GlyAla Val Gly Ser Ala Met Phe Leu Arg Phe 1380 1385 1390 Ile Asn Pro AlaIle Val Ser Pro Tyr Glu Ala Gly Ile Leu Asp Lys 1395 1400 1405 Lys ProPro Pro Arg Ile Glu Arg Gly Leu Lys Leu Met Ser Lys Ile 1410 1415 1420Leu Gln Ser Ile Ala Asn His Val Leu Phe 
Thr Lys Glu Glu His Met 14251430 1435 1440 Arg Pro Phe Asn Asp Phe Val Lys Ser Asn Phe Asp Ala AlaArg Arg 1445 1450 1455 Phe Phe Leu Asp Ile Ala Ser Asp Cys Pro Thr SerAsp Ala Val Asn 1460 1465 1470 His Ser Leu Ser Phe Ile Ser Asp Gly AsnVal Leu Ala Leu His Arg 1475 1480 1485 Leu Leu Trp Asn Asn Gln Glu LysIle Gly Gln Tyr Leu Ser Ser Asn 1490 1495 1500 Arg Asp His Lys Ala ValGly Arg Arg Pro Phe Asp Lys Met Ala Thr 1505 1510 1515 1520 Leu Leu AlaTyr Leu Gly Pro Pro Glu His Lys Pro Val Ala Asp Thr 1525 1530 1535 HisTrp Ser Ser Leu Asn Leu Thr Ser Ser Lys Phe Glu Glu Phe Met 1540 15451550 Thr Arg His Gln Val His Glu Lys Glu Glu Phe Lys Ala Leu Lys Thr1555 1560 1565 Leu Ser Ile Phe Tyr Gln Ala Gly Thr Ser Lys Ala Gly AsnPro Ile 1570 1575 1580 Phe Tyr Tyr Val Ala Arg Arg Phe Lys Thr Gly GlnIle Asn Gly Asp 1585 1590 1595 1600 Leu Leu Ile Tyr His Val Leu Leu ThrLeu Lys Pro Tyr Tyr Ala Lys 1605 1610 1615 Pro Tyr Glu Ile Val Val AspLeu Thr His Thr Gly Pro Ser Asn Arg 1620 1625 1630 Phe Lys Thr Asp PheLeu Ser Lys Trp Phe Val Val Phe Pro Gly Phe 1635 1640 1645 Ala Tyr AspAsn Val Ser Ala Val Tyr Ile Tyr Asn Cys Asn Ser Trp 1650 1655 1660 ValArg Glu Tyr Thr Lys Tyr His Glu Arg Leu Leu Thr Gly Leu Lys 1665 16701675 1680 Gly Ser Lys Arg Leu Val Phe Ile Asp Cys Pro Gly Lys Leu AlaGlu 1685 1690 1695 His Ile Glu His Glu Gln Gln Lys Leu Pro Ala Ala ThrLeu Ala Leu 1700 1705 1710 Glu Glu Asp Leu Lys Val Phe His Asn Ala LeuLys Leu Ala His Lys 1715 1720 1725 Asp Thr Lys Val Ser Ile Lys Val GlySer Thr Ala Val Gln Val Thr 1730 1735 1740 Ser Ala Glu Arg Thr Lys ValLeu Gly Gln Ser Val Phe Leu Asn Asp 1745 1750 1755 1760 Ile Tyr Tyr AlaSer Glu Ile Glu Glu Ile Cys Leu Val Asp Glu Asn 1765 1770 1775 Gln PheThr Leu Thr Ile Ala Asn Gln Gly Thr Pro Leu Thr Phe Met 1780 1785 1790His Gln Glu Cys Glu Ala Ile Val Gln Ser Ile Ile His Ile Arg Thr 17951800 1805 Arg Trp Glu Leu Ser Gln Pro Asp Ser Ile Pro Gln His Thr LysIle 1810 1815 1820 Arg Pro Lys Asp Val Pro Gly Thr Leu Leu Asn Ile AlaLeu Leu Asn 1825 
1830 1835 1840 Leu Gly Ser Ser Asp Pro Ser Leu Arg SerAla Ala Tyr Asn Leu Leu 1845 1850 1855 Cys Ala Leu Thr Cys Thr Phe AsnLeu Lys Ile Glu Gly Gln Leu Leu 1860 1865 1870 Glu Thr Ser Gly Leu CysIle Pro Ala Asn Asn Thr Leu Phe Ile Val 1875 1880 1885 Ser Ile Ser LysThr Leu Ala Ala Asn Glu Pro His Leu Thr Leu Glu 1890 1895 1900 Phe LeuGlu Glu Cys Ile Ser Gly Phe Ser Lys Ser Ser Ile Glu Leu 1905 1910 19151920 Lys His Leu Cys Leu Glu Tyr Met Thr Pro Trp Leu Ser Asn Leu Val1925 1930 1935 Arg Phe Cys Lys His Asn Asp Asp Ala Lys Arg Gln Arg ValThr Ala 1940 1945 1950 Ile Leu Asp Lys Leu Ile Thr Met Thr Ile Asn GluLys Gln Met Tyr 1955 1960 1965 Pro Ser Ile Gln Ala Lys Ile Trp Gly SerLeu Gly Gln Ile Thr Asp 1970 1975 1980 Leu Leu Asp Val Val Leu Asp SerPhe Ile Lys Thr Ser Ala Thr Gly 1985 1990 1995 2000 Gly Leu Gly Ser IleLys Ala Glu Val Met Ala Asp Thr Ala Val Ala 2005 2010 2015 Leu Ala SerGly Asn Val Lys Leu Val Ser Ser Lys Val Ile Gly Arg 2020 2025 2030 MetCys Lys Ile Ile Asp Lys Thr Cys Leu Ser Pro Thr Pro Thr Leu 2035 20402045 Glu Gln His Leu Met Trp Asp Asp Ile Ala Ile Leu Ala Arg Tyr Met2050 2055 2060 Leu Met Leu Ser Phe Asn Asn Ser Leu Asp Val Ala Ala HisLeu Pro 2065 2070 2075 2080 Tyr Leu Phe His Val Val Thr Phe Leu Val AlaThr Gly Pro Leu Ser 2085 2090 2095 Leu Arg Ala Ser Thr His Gly Leu ValIle Asn Ile Ile His Ser Leu 2100 2105 2110 Cys Thr Cys Ser Gln Leu HisPhe Ser Glu Glu Thr Lys Gln Val Leu 2115 2120 2125 Arg Leu Ser Leu ThrGlu Phe Ser Leu Pro Lys Phe Tyr Leu Leu Phe 2130 2135 2140 Gly Ile SerLys Val Lys Ser Ala Ala Val Ile Ala Phe Arg Ser Ser 2145 2150 2155 2160Tyr Arg Asp Arg Ser Phe Ser Pro Gly Ser Tyr Glu Arg Glu Thr Phe 21652170 2175 Ala Leu Thr Ser Leu Glu Thr Val Thr Glu Ala Leu Leu Glu IleMet 2180 2185 2190 Glu Ala Cys Met Arg Asp Ile Pro Thr Cys Lys Trp LeuAsp Gln Trp 2195 2200 2205 Thr Glu Leu Ala Gln Arg Phe Ala Phe Gln TyrAsn Pro Ser Leu Gln 2210 2215 2220 Pro Arg Ala Leu Val Val Phe Gly CysIle Ser Lys Arg Val Ser His 2225 2230 2235 2240 Gly Gln Ile 
Lys Gln IleIle Arg Ile Leu Ser Lys Ala Leu Glu Ser 2245 2250 2255 Cys Leu Lys GlyPro Asp Thr Tyr Asn Ser Gln Val Leu Ile Glu Ala 2260 2265 2270 Thr ValIle Ala Leu Thr Lys Leu Gln Pro Leu Leu Asn Lys Asp Ser 2275 2280 2285Pro Leu His Lys Ala Leu Phe Trp Val Ala Val Ala Val Leu Gln Leu 22902295 2300 Asp Glu Val Asn Leu Tyr Ser Ala Gly Thr Ala Leu Leu Glu GlnAsn 2305 2310 2315 2320 Leu His Thr Leu Asp Ser Leu Arg Ile Phe Asn AspLys Ser Pro Glu 2325 2330 2335 Glu Val Phe Met Ala Ile Arg Asn Pro LeuGlu Trp His Cys Lys Gln 2340 2345 2350 Met Asp His Phe Val Gly Leu AsnPhe Asn Ser Asn Phe Asn Phe Ala 2355 2360 2365 Leu Val Gly His Leu LeuLys Gly Tyr Arg His Pro Ser Pro Ala Ile 2370 2375 2380 Val Ala Arg ThrVal Arg Ile Leu His Thr Leu Leu Thr Leu Val Asn 2385 2390 2395 2400 LysHis Arg Asn Cys Asp Lys Phe Glu Val Asn Thr Gln Ser Val Ala 2405 24102415 Tyr Leu Ala Ala Leu Leu Thr Val Ser Glu Glu Val Arg Ser Arg Cys2420 2425 2430 Ser Leu Lys His Arg Lys Ser Leu Leu Leu Thr Asp Ile SerMet Glu 2435 2440 2445 Asn Val Pro Met Asp Thr Tyr Pro Ile His His GlyAsp Pro Ser Tyr 2450 2455 2460 Arg Thr Leu Lys Glu Thr Gln Pro Trp SerSer Pro Lys Gly Ser Glu 2465 2470 2475 2480 Gly Tyr Leu Ala Ala Thr TyrPro Thr Val Gly Gln Thr Ser Pro Arg 2485 2490 2495 Ala Arg Lys Ser MetSer Leu Asp Met Gly Gln Pro Ser Gln Ala Asn 2500 2505 2510 Thr Lys LysLeu Leu Gly Thr Arg Lys Ser Phe Asp His Leu Ile Ser 2515 2520 2525 AspThr Lys Ala Pro Lys Arg Gln Glu Met Glu Ser Gly Ile Thr Thr 2530 25352540 Pro Pro Lys Met Arg Arg Val Ala Glu Thr Asp Tyr Glu Met Glu Thr2545 2550 2555 2560 Gln Arg Ile Ser Ser Ser Gln Gln His Pro His Leu ArgLys Val Ser 2565 2570 2575 Val Ser Glu Ser Asn Val Leu Leu Asp Glu GluVal Leu Thr Asp Pro 2580 2585 2590 Lys Ile Gln Ala Leu Leu Leu Thr ValLeu Ala Thr Leu Val Lys Tyr 2595 2600 2605 Thr Thr Asp Glu Phe Asp GlnArg Ile Leu Tyr Glu Tyr Leu Ala Glu 2610 2615 2620 Ala Ser Val Val PhePro Lys Val Phe Pro Val Val His Asn Leu Leu 2625 2630 2635 2640 Asp SerLys Ile Asn Thr Leu Leu Ser Leu 
Cys Gln Asp Pro Asn Leu 2645 2650 2655Leu Asn Pro Ile His Gly Ile Val Gln Ser Val Val Tyr His Glu Glu 26602665 2670 Ser Pro Pro Gln Tyr Gln Thr Ser Tyr Leu Gln Ser Phe Gly PheAsn 2675 2680 2685 Gly Leu Trp Arg Phe Ala Gly Pro Phe Ser Lys Gln ThrGln Ile Pro 2690 2695 2700 Asp Tyr Ala Glu Leu Ile Val Lys Phe Leu AspAla Leu Ile Asp Thr 2705 2710 2715 2720 Tyr Leu Pro Gly Ile Asp Glu GluThr Ser Glu Glu Ser Leu Leu Thr 2725 2730 2735 Pro Thr Ser Pro Tyr ProPro Ala Leu Gln Ser Gln Leu Ser Ile Thr 2740 2745 2750 Ala Asn Leu AsnLeu Ser Asn Ser Met Thr Ser Leu Ala Thr Ser Gln 2755 2760 2765 His SerPro Gly Ile Asp Lys Glu Asn Val Glu Leu Ser Pro Thr Thr 2770 2775 2780Gly His Cys Asn Ser Gly Arg Thr Arg His Gly Ser Ala Ser Gln Val 27852790 2795 2800 Gln Lys Gln Arg Ser Ala Gly Ser Phe Lys Arg Asn Ser IleLys Lys 2805 2810 2815 Ile Val 2012 base pairs nucleic acid singlelinear cDNA to mRNA NO NO C-terminal Homo Sapiens several overlappingcDNA′s 17q11.2 CDS 1..1833 misc_feature group(34..36, 1642..1644) /note=“Potential N-glycosylation site” misc_signal group(859..873, 967..981,1012..1026) /note= “Possible nuclear localization signals” misc_feature1210..1236 /note= “PCR primer A” misc_feature 1441..1464 /note= “PCRprimer C” misc_feature 1594..1620 /note= “PCR primer B” misc_feature1914..1939 /note= “PCR primer D” M.R. et al.Wallace Type 1Neurofibromatosis Gene Science 250 12/21/90 1749- 12/21-1990 3 FROM 1 TO2012 M.R. 
et al.Wallace Type 1 Neurofibromatosis Gene Identification ofa Large Transcript in Three NF1 Patients Science 249 07/13/90 181-18607/13-1990 3 FROM 1 TO 2012 3 ACA GAA CTA GCT CAA AGA TTT GCA TTC CAATAT AAT CCA TCC CTG CAA 48 Thr Glu Leu Ala Gln Arg Phe Ala Phe Gln TyrAsn Pro Ser Leu Gln 1 5 10 15 CCA AGA GCT CTT GTT GTC TTT GGG TGT ATTAGC AAA CGA GTG TCT CAT 96 Pro Arg Ala Leu Val Val Phe Gly Cys Ile SerLys Arg Val Ser His 20 25 30 GGG CAG ATA AAG CAG ATA ATC CGT ATT CTT AGCAAG GCA CTT GAG AGT 144 Gly Gln Ile Lys Gln Ile Ile Arg Ile Leu Ser LysAla Leu Glu Ser 35 40 45 TGC TTA AAA GGA CCT GAC ACT TAC AAC AGT CAA GTTCTG ATA GAA GCT 192 Cys Leu Lys Gly Pro Asp Thr Tyr Asn Ser Gln Val LeuIle Glu Ala 50 55 60 ACA GTA ATA GCA CTA ACC AAA TTA CAG CCA CTT CTT AATAAG GAC TCG 240 Thr Val Ile Ala Leu Thr Lys Leu Gln Pro Leu Leu Asn LysAsp Ser 65 70 75 80 CCT CTG CAC AAA GCC CTC TTT TGG GTA GCT GTG GCT GTGCTG CAG CTT 288 Pro Leu His Lys Ala Leu Phe Trp Val Ala Val Ala Val LeuGln Leu 85 90 95 GAT GAG GTC AAC TTG TAT TCA GCA GGT ACC GCA CTT CTT GAACAA AAC 336 Asp Glu Val Asn Leu Tyr Ser Ala Gly Thr Ala Leu Leu Glu GlnAsn 100 105 110 CTG CAT ACT TTA GAT AGT CTC CGT ATA TTC AAT GAC AAG AGTCCA GAG 384 Leu His Thr Leu Asp Ser Leu Arg Ile Phe Asn Asp Lys Ser ProGlu 115 120 125 GAA GTA TTT ATG GCA ATC CGG AAT CCT CTG GAG TGG CAC TGCAAG CAA 432 Glu Val Phe Met Ala Ile Arg Asn Pro Leu Glu Trp His Cys LysGln 130 135 140 ATG GAT CAT TTT GTT GGA CTC AAT TTC AAC TCT AAC TTT AACTTT GCA 480 Met Asp His Phe Val Gly Leu Asn Phe Asn Ser Asn Phe Asn PheAla 145 150 155 160 TTG GTT GGA CAC CTT TTA AAA GGG TAC AGG CAT CCT TCACCT GCT ATT 528 Leu Val Gly His Leu Leu Lys Gly Tyr Arg His Pro Ser ProAla Ile 165 170 175 GTT GCA AGA ACA GTC AGA ATT TTA CAT ACA CTA CTA ACTCTG GTT AAC 576 Val Ala Arg Thr Val Arg Ile Leu His Thr Leu Leu Thr LeuVal Asn 180 185 190 AAA CAC AGA AAT TGT GAC AAA TTT GAA GTG AAT ACA CAGAGC GTG GCC 624 Lys His Arg Asn Cys Asp Lys Phe Glu Val Asn Thr Gln SerVal Ala 195 200 205 TAC 
TTA GCA GCT TTA CTT ACA GTG TCT GAA GAA GTT CGAAGT CGC TGC 672 Tyr Leu Ala Ala Leu Leu Thr Val Ser Glu Glu Val Arg SerArg Cys 210 215 220 AGC CTA AAA CAT AGA AAG TCA CTT CTT CTT ACT GAT ATTTCA ATG GAA 720 Ser Leu Lys His Arg Lys Ser Leu Leu Leu Thr Asp Ile SerMet Glu 225 230 235 240 AAT GTT CCT ATG GAT ACA TAT CCC ATT CAT CAT GGTGAC CCT TCC TAT 768 Asn Val Pro Met Asp Thr Tyr Pro Ile His His Gly AspPro Ser Tyr 245 250 255 AGG ACA CTA AAG GAG ACT CAG CCA TGG TCC TCT CCCAAA GGT TCT GAA 816 Arg Thr Leu Lys Glu Thr Gln Pro Trp Ser Ser Pro LysGly Ser Glu 260 265 270 GGA TAC CTT GCA GCC ACC TAT CCA ACT GTC GGC CAGACC AGT CCC CGA 864 Gly Tyr Leu Ala Ala Thr Tyr Pro Thr Val Gly Gln ThrSer Pro Arg 275 280 285 GCC AGG AAA TCC ATG AGC CTG GAC ATG GGG CAA CCTTCT CAG GCC AAC 912 Ala Arg Lys Ser Met Ser Leu Asp Met Gly Gln Pro SerGln Ala Asn 290 295 300 ACT AAG AAG TTG CTT GGA ACA AGG AAA AGT TTT GATCAC TTG ATA TCA 960 Thr Lys Lys Leu Leu Gly Thr Arg Lys Ser Phe Asp HisLeu Ile Ser 305 310 315 320 GAC ACA AAG GCT CCT AAA AGG CAA GAA ATG GAATCA GGG ATC ACA ACA 1008 Asp Thr Lys Ala Pro Lys Arg Gln Glu Met Glu SerGly Ile Thr Thr 325 330 335 CCC CCC AAA ATG AGG AGA GTA GCA GAA ACT GATTAT GAA ATG GAA ACT 1056 Pro Pro Lys Met Arg Arg Val Ala Glu Thr Asp TyrGlu Met Glu Thr 340 345 350 CAG AGG ATT TCC TCA TCA CAA CAG CAC CCA CATTTA CGT AAA GTT TCA 1104 Gln Arg Ile Ser Ser Ser Gln Gln His Pro His LeuArg Lys Val Ser 355 360 365 GTG TCT GAA TCA AAT GTT CTC TTG GAT GAA GAAGTA CTT ACT GAT CCG 1152 Val Ser Glu Ser Asn Val Leu Leu Asp Glu Glu ValLeu Thr Asp Pro 370 375 380 AAG ATC CAG GCG CTG CTT CTT ACT GTT CTA GCTACA CTG GTA AAA TAT 1200 Lys Ile Gln Ala Leu Leu Leu Thr Val Leu Ala ThrLeu Val Lys Tyr 385 390 395 400 ACC ACA GAT GAG TTT GAT CAA CGA ATT CTTTAT GAA TAC TTA GCA GAG 1248 Thr Thr Asp Glu Phe Asp Gln Arg Ile Leu TyrGlu Tyr Leu Ala Glu 405 410 415 GCC AGT GTT GTG TTT CCC AAA GTC TTT CCTGTT GTG CAT AAT TTG TTG 1296 Ala Ser Val Val Phe Pro Lys Val Phe Pro ValVal His Asn Leu Leu 420 425 
430 GAC TCT AAG ATC AAC ACC CTG TTA TCA TTGTGC CAA GAT CCA AAT TTG 1344 Asp Ser Lys Ile Asn Thr Leu Leu Ser Leu CysGln Asp Pro Asn Leu 435 440 445 TTA AAT CCA ATC CAT GGA ATT GTG CAG AGTGTG GTG TAC CAT GAA GAA 1392 Leu Asn Pro Ile His Gly Ile Val Gln Ser ValVal Tyr His Glu Glu 450 455 460 TCC CCA CCA CAA TAC CAA ACA TCT TAC CTGCAA AGT TTT GGT TTT AAT 1440 Ser Pro Pro Gln Tyr Gln Thr Ser Tyr Leu GlnSer Phe Gly Phe Asn 465 470 475 480 GGC TTG TGG CGG TTT GCA GGA CCG TTTTCA AAG CAA ACA CAA ATT CCA 1488 Gly Leu Trp Arg Phe Ala Gly Pro Phe SerLys Gln Thr Gln Ile Pro 485 490 495 GAC TAT GCT GAG CTT ATT GTT AAG TTTCTT GAT GCC TTG ATT GAC ACG 1536 Asp Tyr Ala Glu Leu Ile Val Lys Phe LeuAsp Ala Leu Ile Asp Thr 500 505 510 TAC CTG CCT GGA ATT GAT GAA GAA ACCAGT GAA GAA TCC CTC CTG ACT 1584 Tyr Leu Pro Gly Ile Asp Glu Glu Thr SerGlu Glu Ser Leu Leu Thr 515 520 525 CCC ACA TCT CCT TAC CCT CCT GCA CTGCAG AGC CAG CTT AGT ATC ACT 1632 Pro Thr Ser Pro Tyr Pro Pro Ala Leu GlnSer Gln Leu Ser Ile Thr 530 535 540 GCC AAC CTT AAC CTT TCT AAT TCC ATGACC TCA CTT GCA ACT TCC CAG 1680 Ala Asn Leu Asn Leu Ser Asn Ser Met ThrSer Leu Ala Thr Ser Gln 545 550 555 560 CAT TCC CCA GGA ATC GAC AAG GAGAAC GTT GAA CTC TCC CCT ACC ACT 1728 His Ser Pro Gly Ile Asp Lys Glu AsnVal Glu Leu Ser Pro Thr Thr 565 570 575 GGC CAC TGT AAC AGT GGA CGA ACTCGC CAC GGA TCC GCA AGC CAA GTG 1776 Gly His Cys Asn Ser Gly Arg Thr ArgHis Gly Ser Ala Ser Gln Val 580 585 590 CAG AAG CAA AGA AGC GCT GGC AGTTTC AAA CGT AAT AGC ATT AAG AAG 1824 Gln Lys Gln Arg Ser Ala Gly Ser PheLys Arg Asn Ser Ile Lys Lys 595 600 605 ATC GTG TGA AGCTTGCTTGCTTTCTTTTT TAAAATCAAC TTAACATGGG 1873 Ile Val * 610 CTCTTCACTAGTGACCCCTT CCCTGTCCTT GCCCTTTCCC CCCATGTTGT AATGCTGCAC 1933 TTCCTGTTTTATAATGAACC CATCCGGTTT GCCATGTTGC CAGATGATCA ACTCTTCGAA 1993 GCCTTGCCTAAATTTAATG 2012 610 amino acids amino acid linear protein unknown 4 ThrGlu Leu Ala Gln Arg Phe Ala Phe Gln Tyr Asn Pro Ser Leu Gln 1 5 10 15Pro Arg Ala Leu Val Val Phe Gly Cys Ile Ser Lys Arg 
Val Ser His 20 25 30Gly Gln Ile Lys Gln Ile Ile Arg Ile Leu Ser Lys Ala Leu Glu Ser 35 40 45Cys Leu Lys Gly Pro Asp Thr Tyr Asn Ser Gln Val Leu Ile Glu Ala 50 55 60Thr Val Ile Ala Leu Thr Lys Leu Gln Pro Leu Leu Asn Lys Asp Ser 65 70 7580 Pro Leu His Lys Ala Leu Phe Trp Val Ala Val Ala Val Leu Gln Leu 85 9095 Asp Glu Val Asn Leu Tyr Ser Ala Gly Thr Ala Leu Leu Glu Gln Asn 100105 110 Leu His Thr Leu Asp Ser Leu Arg Ile Phe Asn Asp Lys Ser Pro Glu115 120 125 Glu Val Phe Met Ala Ile Arg Asn Pro Leu Glu Trp His Cys LysGln 130 135 140 Met Asp His Phe Val Gly Leu Asn Phe Asn Ser Asn Phe AsnPhe Ala 145 150 155 160 Leu Val Gly His Leu Leu Lys Gly Tyr Arg His ProSer Pro Ala Ile 165 170 175 Val Ala Arg Thr Val Arg Ile Leu His Thr LeuLeu Thr Leu Val Asn 180 185 190 Lys His Arg Asn Cys Asp Lys Phe Glu ValAsn Thr Gln Ser Val Ala 195 200 205 Tyr Leu Ala Ala Leu Leu Thr Val SerGlu Glu Val Arg Ser Arg Cys 210 215 220 Ser Leu Lys His Arg Lys Ser LeuLeu Leu Thr Asp Ile Ser Met Glu 225 230 235 240 Asn Val Pro Met Asp ThrTyr Pro Ile His His Gly Asp Pro Ser Tyr 245 250 255 Arg Thr Leu Lys GluThr Gln Pro Trp Ser Ser Pro Lys Gly Ser Glu 260 265 270 Gly Tyr Leu AlaAla Thr Tyr Pro Thr Val Gly Gln Thr Ser Pro Arg 275 280 285 Ala Arg LysSer Met Ser Leu Asp Met Gly Gln Pro Ser Gln Ala Asn 290 295 300 Thr LysLys Leu Leu Gly Thr Arg Lys Ser Phe Asp His Leu Ile Ser 305 310 315 320Asp Thr Lys Ala Pro Lys Arg Gln Glu Met Glu Ser Gly Ile Thr Thr 325 330335 Pro Pro Lys Met Arg Arg Val Ala Glu Thr Asp Tyr Glu Met Glu Thr 340345 350 Gln Arg Ile Ser Ser Ser Gln Gln His Pro His Leu Arg Lys Val Ser355 360 365 Val Ser Glu Ser Asn Val Leu Leu Asp Glu Glu Val Leu Thr AspPro 370 375 380 Lys Ile Gln Ala Leu Leu Leu Thr Val Leu Ala Thr Leu ValLys Tyr 385 390 395 400 Thr Thr Asp Glu Phe Asp Gln Arg Ile Leu Tyr GluTyr Leu Ala Glu 405 410 415 Ala Ser Val Val Phe Pro Lys Val Phe Pro ValVal His Asn Leu Leu 420 425 430 Asp Ser Lys Ile Asn Thr Leu Leu Ser LeuCys Gln Asp Pro Asn Leu 435 440 445 Leu Asn Pro Ile His Gly 
Ile Val GlnSer Val Val Tyr His Glu Glu 450 455 460 Ser Pro Pro Gln Tyr Gln Thr SerTyr Leu Gln Ser Phe Gly Phe Asn 465 470 475 480 Gly Leu Trp Arg Phe AlaGly Pro Phe Ser Lys Gln Thr Gln Ile Pro 485 490 495 Asp Tyr Ala Glu LeuIle Val Lys Phe Leu Asp Ala Leu Ile Asp Thr 500 505 510 Tyr Leu Pro GlyIle Asp Glu Glu Thr Ser Glu Glu Ser Leu Leu Thr 515 520 525 Pro Thr SerPro Tyr Pro Pro Ala Leu Gln Ser Gln Leu Ser Ile Thr 530 535 540 Ala AsnLeu Asn Leu Ser Asn Ser Met Thr Ser Leu Ala Thr Ser Gln 545 550 555 560His Ser Pro Gly Ile Asp Lys Glu Asn Val Glu Leu Ser Pro Thr Thr 565 570575 Gly His Cys Asn Ser Gly Arg Thr Arg His Gly Ser Ala Ser Gln Val 580585 590 Gln Lys Gln Arg Ser Ala Gly Ser Phe Lys Arg Asn Ser Ile Lys Lys595 600 605 Ile Val 610 1212 base pairs nucleic acid single linear cDNANO NO Homo Sapiens CDS 211..1212 misc_feature 52..54 /note= “Upstream inframe stop codon” misc_feature 98..119 /note= “Oligonucleotide used forprimer extension” misc_feature (270{circumflex over ( )}271) /note=“Position of the first intron and alternate sequences(SEQ ID NO6 throughSEQ ID NO8) diverge” 5 CCCCAGCCTC CTTGCCAACC CCCCTTTCCC TCTCCCCCTCCCGCTCGGCG CTGACCCCCC 60 ATCCCCACCC CCGTGGGAAC ACTGGGAGCC TGCACTCCACAGACCCTCTC CTTGCCTCTT 120 CCCTCACCTC AGCCTCCGCT CCCCGCCCTC TTCCCGGCCCAGGGCGCCGG CCCACCCTTC 180 CCTCCGCCGC CCCCCGGCCG CGGGGAGGAC ATG GCC GCGCAC AGG CCG GTG GAA 234 Met Ala Ala His Arg Pro Val Glu 615 TGG GTC CAGGCC GTG GTC AGC CGC TTC GAC GAG CAG CTT CCA ATA AAA 282 Trp Val Gln AlaVal Val Ser Arg Phe Asp Glu Gln Leu Pro Ile Lys 620 625 630 635 ACA GGACAG CAG AAC ACA CAT ACC AAA GTC AGT ACT GAG CAC AAC AAG 330 Thr Gly GlnGln Asn Thr His Thr Lys Val Ser Thr Glu His Asn Lys 640 645 650 GAA TGTCTA ATC AAT ATT TCC AAA TAC AAG TTT TCT TTG GTT ATA AGC 378 Glu Cys LeuIle Asn Ile Ser Lys Tyr Lys Phe Ser Leu Val Ile Ser 655 660 665 GGC CTCACT ACT ATT TTA AAG AAT GTT AAC AAT ATG AGA ATA TTT GGA 426 Gly Leu ThrThr Ile Leu Lys Asn Val Asn Asn Met Arg Ile Phe Gly 670 675 680 GAA GCTGCT GAA AAA 
AAT TTA TAT CTC TCT CAG TTG ATT ATA TTG GAT 474 Glu Ala AlaGlu Lys Asn Leu Tyr Leu Ser Gln Leu Ile Ile Leu Asp 685 690 695 ACA CTGGAA AAA TGT CTT GCT GGG CAA CCA AAG GAC ACA ATG AGA TTA 522 Thr Leu GluLys Cys Leu Ala Gly Gln Pro Lys Asp Thr Met Arg Leu 700 705 710 715 GATGAA ACG ATG CTG GTC AAA CAG TTG CTG CCA GAA ATC TGC CAT TTT 570 Asp GluThr Met Leu Val Lys Gln Leu Leu Pro Glu Ile Cys His Phe 720 725 730 CTTCAC ACC TGT CGT GAA GGA AAC CAG CAT GCA GCT GAA CTT CGG AAT 618 Leu HisThr Cys Arg Glu Gly Asn Gln His Ala Ala Glu Leu Arg Asn 735 740 745 TCTGCC TCT GGG GTT TTA TTT TCT CTC AGC TGC AAC AAC TTC AAT GCA 666 Ser AlaSer Gly Val Leu Phe Ser Leu Ser Cys Asn Asn Phe Asn Ala 750 755 760 GTCTTT AGT CGC ATT TCT ACC AGG TTA CAG GAA TTA ACT GTT TGT TCA 714 Val PheSer Arg Ile Ser Thr Arg Leu Gln Glu Leu Thr Val Cys Ser 765 770 775 GAAGAC AAT GTT GAT GTT CAT GAT ATA GAA TTG TTA CAG TAT ATC AAT 762 Glu AspAsn Val Asp Val His Asp Ile Glu Leu Leu Gln Tyr Ile Asn 780 785 790 795GTG GAT TGT GCA AAA TTA AAA CGA CTC CTG AAG GAA ACA GCA TTT AAA 810 ValAsp Cys Ala Lys Leu Lys Arg Leu Leu Lys Glu Thr Ala Phe Lys 800 805 810TTT AAA GCC CTA AAG AAG GTT GCG CAG TTA GCA GTT ATA AAT AGC CTG 858 PheLys Ala Leu Lys Lys Val Ala Gln Leu Ala Val Ile Asn Ser Leu 815 820 825GAA AAG GCA TTT TGG AAC TGG GTA GAA AAT TAT CCA GAT GAA TTT ACA 906 GluLys Ala Phe Trp Asn Trp Val Glu Asn Tyr Pro Asp Glu Phe Thr 830 835 840AAA CTG TAC CAG ATC CCA CAG ACT GAT ATG GCT GAA TGT GCA GAA AAG 954 LysLeu Tyr Gln Ile Pro Gln Thr Asp Met Ala Glu Cys Ala Glu Lys 845 850 855CTA TTT GAC TTG GTG GAT GGT TTT GCT GAA AGC ACC AAA CGT AAA GCA 1002 LeuPhe Asp Leu Val Asp Gly Phe Ala Glu Ser Thr Lys Arg Lys Ala 860 865 870875 GCA GTT TGG CCA CTA CAA ATC ATT CTC CTT ATC TTG TGT CCA GAA ATA 1050Ala Val Trp Pro Leu Gln Ile Ile Leu Leu Ile Leu Cys Pro Glu Ile 880 885890 ATC CAG GAT ATA TCC AAA GAC GTG GTT GAT GAA AAC AAC ATG AAT AAG 1098Ile Gln Asp Ile Ser Lys Asp Val Val Asp Glu Asn Asn Met Asn Lys 895 900905 AAG TTA TTT CTG 
GAC AGT CTA CGA AAA GCT CTT GCT GGC CAT GGA GGA 1146Lys Leu Phe Leu Asp Ser Leu Arg Lys Ala Leu Ala Gly His Gly Gly 910 915920 AGT AGG CAG CTG ACA GAA AGT GCT GCA ATT GCC TGT GTC AAA CTG TGT 1194Ser Arg Gln Leu Thr Glu Ser Ala Ala Ile Ala Cys Val Lys Leu Cys 925 930935 AAA GCA AGT ACT TAC ATC 1212 Lys Ala Ser Thr Tyr Ile 940 945 334amino acids amino acid linear protein unknown 6 Met Ala Ala His Arg ProVal Glu Trp Val Gln Ala Val Val Ser Arg 1 5 10 15 Phe Asp Glu Gln LeuPro Ile Lys Thr Gly Gln Gln Asn Thr His Thr 20 25 30 Lys Val Ser Thr GluHis Asn Lys Glu Cys Leu Ile Asn Ile Ser Lys 35 40 45 Tyr Lys Phe Ser LeuVal Ile Ser Gly Leu Thr Thr Ile Leu Lys Asn 50 55 60 Val Asn Asn Met ArgIle Phe Gly Glu Ala Ala Glu Lys Asn Leu Tyr 65 70 75 80 Leu Ser Gln LeuIle Ile Leu Asp Thr Leu Glu Lys Cys Leu Ala Gly 85 90 95 Gln Pro Lys AspThr Met Arg Leu Asp Glu Thr Met Leu Val Lys Gln 100 105 110 Leu Leu ProGlu Ile Cys His Phe Leu His Thr Cys Arg Glu Gly Asn 115 120 125 Gln HisAla Ala Glu Leu Arg Asn Ser Ala Ser Gly Val Leu Phe Ser 130 135 140 LeuSer Cys Asn Asn Phe Asn Ala Val Phe Ser Arg Ile Ser Thr Arg 145 150 155160 Leu Gln Glu Leu Thr Val Cys Ser Glu Asp Asn Val Asp Val His Asp 165170 175 Ile Glu Leu Leu Gln Tyr Ile Asn Val Asp Cys Ala Lys Leu Lys Arg180 185 190 Leu Leu Lys Glu Thr Ala Phe Lys Phe Lys Ala Leu Lys Lys ValAla 195 200 205 Gln Leu Ala Val Ile Asn Ser Leu Glu Lys Ala Phe Trp AsnTrp Val 210 215 220 Glu Asn Tyr Pro Asp Glu Phe Thr Lys Leu Tyr Gln IlePro Gln Thr 225 230 235 240 Asp Met Ala Glu Cys Ala Glu Lys Leu Phe AspLeu Val Asp Gly Phe 245 250 255 Ala Glu Ser Thr Lys Arg Lys Ala Ala ValTrp Pro Leu Gln Ile Ile 260 265 270 Leu Leu Ile Leu Cys Pro Glu Ile IleGln Asp Ile Ser Lys Asp Val 275 280 285 Val Asp Glu Asn Asn Met Asn LysLys Leu Phe Leu Asp Ser Leu Arg 290 295 300 Lys Ala Leu Ala Gly His GlyGly Ser Arg Gln Leu Thr Glu Ser Ala 305 310 315 320 Ala Ile Ala Cys ValLys Leu Cys Lys Ala Ser Thr Tyr Ile 325 330 60 base pairs nucleic acidsingle linear unknown 
misc_feature group(28..34, 59..60) /note= “Consensus AG and lariat sequences”
7 TATTTATGGT CGTTTTTAAG GATAAGCTGT TAACGTGTTT TTTTTTTCTT TTTTTTTCAG 60

84 base pairs nucleic acid single linear unknown CDS 1..84 misc_feature (60^61) /note= “Position of the first splice junction separating exon 2 to the right and exon 1 to the left”
8 ATG GCC GCG CAC AGG CCG GTG GAA TGG GTC CAG GCC GTG GTC AGC CGC 48
Met Ala Ala His Arg Pro Val Glu Trp Val Gln Ala Val Val Ser Arg
335 340 345 350
TTC GAC GAG CAG CTT CCA ATA AAA ACA GGA CAG CAG 84
Phe Asp Glu Gln Leu Pro Ile Lys Thr Gly Gln Gln
355 360

28 amino acids amino acid linear protein unknown
9 Met Ala Ala His Arg Pro Val Glu Trp Val Gln Ala Val Val Ser Arg
1 5 10 15
Phe Asp Glu Gln Leu Pro Ile Lys Thr Gly Gln Gln
20 25

60 base pairs nucleic acid single linear unknown CDS 19..60 misc_feature 16..18 /note= “In frame stop codon”
10 AGCCTCTTGT GGCTTTGA ATT TTG TTT CAT CAA TTC CTA GGG TTT TGG CAA 51
Ile Leu Phe His Gln Phe Leu Gly Phe Trp Gln
30 35
CTT CTC CTG 60
Leu Leu Leu
40

14 amino acids amino acid linear protein unknown
11 Ile Leu Phe His Gln Phe Leu Gly Phe Trp Gln Leu Leu Leu
1 5 10

22 base pairs nucleic acid single linear unknown
12 AGAGGCAAGG AGAGGGTCTG TG 22

18 amino acids amino acid single linear unknown
13 Ala Ser Leu Pro Cys Ser Asn Ser Ala Val Phe Met Gln Leu Phe Pro
1 5 10 15
His Gln

21 amino acids amino acid single linear unknown
14 Ala Thr Cys His Ser Leu Leu Asn Lys Ala Thr Val Lys Glu Lys Lys
1 5 10 15
Glu Asn Lys Lys Ser
20
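
The feature annotations in the listing above (for example "CDS 19..60" and "In frame stop codon" at 16..18 for the 60 base pair entry labeled 10) can be checked mechanically. The following is a minimal sketch in Python, not part of the original disclosure, that translates an annotated coding region with the standard genetic code. The function name `translate`, the variable names, and the choice of this particular entry are illustrative assumptions.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumed here;
# the listing itself does not name a translation table).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(seq: str, start: int, end: int) -> str:
    """Translate seq[start..end] (1-based, inclusive) into one-letter codes.

    A stop codon is reported as '*'.
    """
    region = seq[start - 1:end].upper()
    if len(region) % 3 != 0:
        raise ValueError("coding region length is not a multiple of three")
    return "".join(
        CODON_TABLE[region[i:i + 3]] for i in range(0, len(region), 3)
    )


# The 60 base pair entry labeled 10 in the listing, copied with its spacing
# and then stripped of blanks.
ENTRY_10 = (
    "AGCCTCTTGT GGCTTTGA ATT TTG TTT CAT CAA TTC CTA GGG TTT TGG CAA "
    "CTT CTC CTG"
).replace(" ", "")

print(translate(ENTRY_10, 19, 60))  # ILFHQFLGFWQLLL
print(translate(ENTRY_10, 16, 18))  # *
```

The one-letter result ILFHQFLGFWQLLL corresponds to the 14 amino acid entry (Ile Leu Phe His Gln Phe Leu Gly Phe Trp Gln Leu Leu Leu) that follows it in the listing, and positions 16..18 (TGA) translate to a stop, in agreement with the annotation.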

What is claimed is:
1. A nucleic acid sequence comprising SEQ ID NO: 3.

2. A nucleic acid sequence coding for the amino acid sequence of SEQ ID NO: 2.

3. A nucleic acid sequence complementary to the sequence of claim 1 or claim 2.

4. A nucleic acid sequence complementary to the nucleic acid sequence of claim 2.

5. The transcript of the sequence of claim 1.

6. A transcript of the nucleic acid sequence of claim 2.

7. A nucleic acid probe or primer comprising at least 20 contiguous nucleotides of the nucleic acid sequence of SEQ ID NO: 3.

8. The complementary sequence of the probe or primer of claim 7.

9. Probe B3A (SEQ ID NO: 1).

10. A nucleic acid sequence complementary to the probe of claim 9.

11. Purified NF1 transcript.

12. Isolated DNA template sequence for the transcript of claim 11.

13. The DNA sequence substantially complementary to the template sequence of claim 12.

14. The transcript of claim 11, wherein the transcript is not of human in vivo origin.

15. A method of screening a patient for NF1 comprising the steps of: a) providing a probe comprising at least 20 contiguous nucleotides of the NF1 gene or its transcript; b) obtaining a sample of the patient's blood or tissue of the type in which NF1 is normally expressed; c) contacting the sample under conditions favorable to hybridization of the probe to any complementary sequences of nucleic acid in the sample; d) providing means for detection of hybridization; and e) detecting hybridization, the presence of hybridized probe being a positive screen for NF1.

16. The method of claim 15 further comprising the steps of: f) providing means for quantifying hybridization of the probe to complementary sequences; and g) quantifying hybridization.

17. A nucleic acid sequence comprising the open reading frame of SEQ ID NO: 1.

18. A nucleic acid sequence complementary to the sequence of claim 17.

19. The transcript of the sequence of claim 17.

20. A nucleic acid probe or primer consisting of at least 20 contiguous nucleotides of SEQ ID NO: 3.

21. The complementary sequence of the probe of claim 20.

22. An assay kit for screening for NF1 comprising: a) a nucleic acid probe or primer of claims 6 or 7; b) reagents for hybridization of the probe to a complementary nucleic acid sequence; and c) means for detecting hybridization.

23. A nucleic acid sequence consisting of the sequence of probe P5 (SEQ ID NO: 3).

24. A method of detecting the presence of wild-type NF1 gene in an individual, the method comprising: a) providing nucleic acid from a sample of normal or pathological tissue from the individual in which NF1 is likely to be expressed; b) providing at least two primers comprising at least 20 nucleotides, the primers being specific for NF1 DNA; c) performing a polymerase chain reaction (PCR) using the nucleic acids from steps (a) and (b); and d) detecting and analyzing the products of PCR for the presence of wild-type NF1 gene in the individual.

25. The method of claim 24, wherein at least one of the primers of step b) is the primer of claim 6 or the complement to the primer of claim 6.

26. The method of claim 24, wherein the step of detecting and analyzing the products of PCR comprises comparing the sequence of the PCR product against the sequence of a nucleic acid sequence comprising SEQ ID NO: 3 or the complement to the nucleic acid sequence comprising SEQ ID NO: 3.
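
By way of illustration only, and without limiting the claims, the notions of a "complementary sequence" and of "at least 20 contiguous nucleotides" recited above can be sketched in Python. The sketch assumes exact Watson-Crick base pairing and plain string matching, which is not a model of hybridization conditions. The function names are hypothetical. The example data are the 22 base pair oligonucleotide from the sequence listing and positions 61 to 120 of the 1212 base pair cDNA entry, chosen only because that entry is annotated with an "Oligonucleotide used for primer extension" at 98..119; the claims themselves refer to other sequences.

```python
# Complement table for unambiguous DNA bases only (an assumption of this
# sketch; ambiguity codes are not handled).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def shares_contiguous_run(probe: str, target: str, min_len: int = 20) -> bool:
    """True if any min_len contiguous nucleotides of probe occur in target."""
    probe, target = probe.upper(), target.upper()
    return any(
        probe[i:i + min_len] in target
        for i in range(len(probe) - min_len + 1)
    )


def is_complementary_probe(probe: str, target: str, min_len: int = 20) -> bool:
    """True if probe is exactly complementary to target over min_len bases."""
    return shares_contiguous_run(reverse_complement(probe), target, min_len)


# Positions 61..120 of the 1212 base pair cDNA entry in the listing.
TARGET_61_120 = (
    "ATCCCCACCC CCGTGGGAAC ACTGGGAGCC TGCACTCCAC AGACCCTCTC CTTGCCTCTT"
).replace(" ", "")

# The 22 base pair oligonucleotide entry in the listing.
OLIGO = "AGAGGCAAGG AGAGGGTCTG TG".replace(" ", "")

site = TARGET_61_120.find(reverse_complement(OLIGO))
print(reverse_complement(OLIGO))                     # CACAGACCCTCTCCTTGCCTCT
print(site + 61)                                     # 98
print(is_complementary_probe(OLIGO, TARGET_61_120))  # True
```

Under these assumptions the 22-mer is the exact reverse complement of positions 98..119 of the cDNA entry, which agrees with the primer-extension annotation in the listing.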